Inferring conversational functions in Japanese discourse with Discourse Marker Complex
Eiji Tomida & Shunichi Maruno
Faculty of Human-Environmental Studies,
Kyushu University
Growing interest in Discourse Process
Many social and cognitive scientists have
been interested in discourse processes.
Observing discourse, we can find important
interactions for socially-shared reasoning.
What kind of utterance facilitates reasoning?
How does a new idea emerge in conversation?
Most studies in psychology and other
social sciences analyze discourse
data manually. For example…
Manual coding procedure
1. Construct a coding scheme.
2. Using the scheme, two independent coders
assign one of its categories to each
conversational turn.
3. The total number of times each category is
assigned is calculated for each participant.
4. The frequencies of these categories are
compared with other variables.
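Steps 2 and 3 of the procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `tally_by_participant` and `cohen_kappa` are hypothetical, the data layout (a list of speaker/category pairs) is assumed, and Cohen's kappa is used here as one common agreement index between two coders; the slides do not say which reliability measure was actually used.

```python
from collections import Counter, defaultdict


def tally_by_participant(coded_turns):
    """Count how often each category was assigned to each participant.

    coded_turns: iterable of (speaker, category) pairs (assumed layout).
    """
    counts = defaultdict(Counter)
    for speaker, category in coded_turns:
        counts[speaker][category] += 1
    return counts


def cohen_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders who labelled the same turns.

    Assumption: kappa is used as the agreement index; the original
    study may have used a different measure.
    """
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must label the same non-empty set of turns")
    n = len(codes_a)
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    # Agreement expected by chance, from each coder's marginal frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical toy data, not taken from the corpus.
turns = [("S1", "Doubt"), ("S1", "Doubt"), ("S2", "Explanation")]
per_participant = tally_by_participant(turns)
```

The per-participant counts are what step 4 would then compare with other variables.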
Some categories in a coding scheme
(Tomida & Maruno, 2005)
Function | Description
Counterargument | Providing one's own ideas which are in opposition to another member’s ideas.
Doubt | Doubting the certainty of what another member said.
Interpretation | Interpreting what another member means in his/her previous utterance.
Confirmation | Making sure whether one’s own understanding of another member’s utterance is correct or not.
Explanation | Adding a more detailed explanation to one’s previous utterance.
However…
Methodological Problems
Relatively low reliability
Time consuming
An automatic coding system is needed.
What index can be utilized for the
automatic coding system?
Possible index for functional
inference
Discourse Marker (Schiffrin, 1987)
Single words or lexicalized phrases that are
assumed to organize discourse structure.
Examples: ‘and’, ‘but’, ‘because’, ‘now’, ‘then’,
and ‘I mean’.
“By the way” signals the start of a digression.
“Anyway” signals the return from a digression.
Limitation of the discourse marker
• A discourse marker is inductively assumed to be an
index that signals a specific function.
• However, not every utterance that has such a
function contains the typical marker.
→ Not enough for accurate detection
• If many different markers are combined, a more
accurate and more robust inference system
becomes possible.
Concept of Discourse Marker Complex
Aspect | Original discourse marker | Discourse Marker Complex
Number of words | One or a few | More than 10 or so
Form of marker | Word or phrase | Conditional expression with words or phrases
How the marker’s function is determined | Theoretical assumption | Empirical examination
Corpus construction
Participants: Japanese College students
43 participants, divided into 10 groups.
Discussion: 30 min.
Task: To jointly construct a “naïve model”
that explains the causal mechanisms of
Japanese teenagers’ aggression.
Discussions were transcribed and tagged with:
Speaker’s name
Utterance function
DMC construction process
(1) Explore candidate words for DMC, referring to
manually coded utterances.
(2) Calculate coverage rates of the candidate
words.
(3) Construct a DMC, combining all the candidate
words. The constructed DMC is repeatedly
examined and modified.
○Analysis tool: KH Coder (Higuchi, 2001)
(internally, MySQL and ChaSen are also used)
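Step (2), the coverage rate, can be illustrated with a small Python sketch. It assumes that coverage means the share of utterances manually coded with a given function that contain at least one candidate word, which is how the coverage fractions later in the slides read. The function name `coverage_rate` is hypothetical, and plain substring matching stands in for the morphological analysis that a tool such as ChaSen would provide.

```python
def coverage_rate(utterances, function, candidates):
    """Share of utterances coded as `function` that contain any candidate word.

    utterances: list of (manual_function, text) pairs (assumed layout).
    candidates: candidate marker strings, matched as plain substrings
                (an assumption; the original analysis was morpheme-based).
    """
    target = [text for func, text in utterances if func == function]
    if not target:
        return 0.0
    hits = sum(any(word in text for word in candidates) for text in target)
    return hits / len(target)


# Shortened versions of the three counter-argument examples in the slides.
sample = [
    ("Counter-argument", "でもさー、なんか、人を刺したらいけん"),
    ("Counter-argument", "ちょっと違うんよね"),
    ("Counter-argument", "いや、それは我慢できないほどのストレスが"),
]

single = coverage_rate(sample, "Counter-argument", ["でも"])
combined = coverage_rate(sample, "Counter-argument", ["でも", "いや", "違う"])
```

On these three utterances a single marker covers one of three, while the combined set covers all three, which is the motivation for combining candidate words in step (3).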
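Exploring candidate words for DMC
Group | Utterance No. | Ss No. | Function | Utterance content
J2 | 45 | 28 | Counter-argument | でもさー(But)、なんか、人を刺したらいけん、殺したらいけんって言うのはさ、想像力とかそういう段階じゃない気がする。 [English: But, you know, I feel that "you must not stab people, you must not kill people" is not a matter of imagination or anything at that level.]
G1 | 66 | 9 | Counter-argument | ちょっと違うんよね(My opinion is a little bit different from yours)。俺の経験から言うとね,キレるというのは,いきなりスコーンと飛んでしまう・・・。 [English: That's a little different. Speaking from my experience, snapping is when you suddenly just fly off the handle...]
C3 | 161 | 14 | Counter-argument | いや(No),それは我慢できないほどのストレスが来たことが無いからなんじゃない?(isn’t it?). [English: No, isn't that because you have never had stress so great that you could not bear it?]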
Results
We have constructed DMCs for:
Counter-argument
Confirmation
We also found that some categories cannot be
distinguished from each other.
Counter-argument & Explanation
When people make a counter-argument, they
usually add detailed reasons for their opposition.
An abbreviated sample of DMC for
counter-argument
<*というか>
or ( いや or いやいや )
or ( でも or けど )
or ( '関係ない' )
or ( 違う or ちがう )
or ( 単なる and 'しか' and ( ない or 無い ) )
or ( 'では' and ( ない or 無い ) )
or ( 'どっちかと' or 'どっちかっ' )
or さ
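The rule above is a Boolean expression over words, so it can be approximated in ordinary code. The following Python sketch is illustrative only and rests on several assumptions: every term is matched as a plain substring rather than as a morpheme; the first line, `<*というか>`, looks like a reference to a separately defined rule and is reduced here to a substring test; and the final term さ is left out because a one-character substring test would match far too many utterances without token-level analysis. The names `has` and `matches_counterargument_dmc` are hypothetical.

```python
def has(text, *words):
    """True if any of the given words occurs in text (substring match)."""
    return any(word in text for word in words)


def matches_counterargument_dmc(text):
    """Approximate the abbreviated counter-argument DMC shown above.

    Assumptions: substring matching instead of morpheme matching;
    <*というか> reduced to a substring test; the particle さ omitted.
    """
    return (
        has(text, "というか")
        or has(text, "いや", "いやいや")
        or has(text, "でも", "けど")
        or has(text, "関係ない")
        or has(text, "違う", "ちがう")
        or (has(text, "単なる") and has(text, "しか") and has(text, "ない", "無い"))
        or (has(text, "では") and has(text, "ない", "無い"))
        or has(text, "どっちかと", "どっちかっ")
    )


# Example from the slides (shortened) and a hypothetical non-matching reply.
flagged = matches_counterargument_dmc("ちょっと違うんよね")
not_flagged = matches_counterargument_dmc("そうですね")
```

Writing the rule as a single `or` chain keeps it close to the original expression, so each line of the DMC maps to one line of code.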
Coverage & correlation of DMCs
Function | DMC for Counterargument: Coverage, r, p | DMC for Confirmation: Coverage, r, p
Counterargument | 89/125 (71.2%), .74, .00 | 6/125 (4.8%), .03, .83
Doubt | 18/60 (30.0%), .18, .26 | 5/60 (8.33%), .44, .00
Interpretation | 27/119 (22.69%), .16, .31 | 22/119 (18.49%), .27, .08
Confirmation | 14/178 (7.9%), .32, .04 | 91/178 (51.1%), .53, .00
Explanation | 108/225 (48.0%), .84, .00 | 24/225 (10.7%), .02, .91
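The two kinds of figures reported here can be reproduced with a short Python sketch. The coverage percentages are simple ratios. For r, the sketch assumes a Pearson correlation between per-participant counts of DMC matches and per-participant counts of manually coded utterances; the slides do not spell out how r was computed, so this reading is an assumption, and the helper names are hypothetical.

```python
import math


def coverage_percent(hits, total):
    """Coverage as a percentage, e.g. 89 of 125 coded utterances."""
    return 100 * hits / total


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Coverage of the counter-argument DMC on counter-argument utterances.
counter_coverage = round(coverage_percent(89, 125), 1)

# Hypothetical per-participant counts (DMC matches vs. manual codes).
dmc_counts = [1, 2, 3, 4]
manual_counts = [2, 4, 6, 8]
r = pearson_r(dmc_counts, manual_counts)
```

With real data, `dmc_counts` and `manual_counts` would each hold one value per participant.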
Preliminary validation of DMCs
• Correlations with self-rated conversational
behaviors during discussion (7-point scale); p values in parentheses.
Self-rating item | Counter-argument: DMC, Manual | Confirmation: DMC, Manual
How often did you argue against other members? | .52 (p < .001), .52 (p < .001) | .22 (.17), .23 (.14)
How often were you challenged by other members? | .30 (.05), .32 (.03) | .001 (.99), .09 (.57)
Conclusions
The accuracy of DMCs is not perfect, but
sufficient.
Not enough for precise one-to-one
matching.
Enough for discovering individual
differences among people: who is more
likely to generate the targeted utterances?
DMC is useful for discourse analysis.
Further Tasks
Construct DMCs for other conversational
functions.
Validate with other similar corpora.
Utilize contextual information.
Classify some meaning words.
Utilize other techniques (hopefully).
• Interactive Evolutionary Computation (IEC) for
automated exploration of words and phrases.
Thank you