2011 Sample Data
The file includes five notes in plain text format and the same five notes with annotations.
The input files end with .txt. They contain tokenized sentences, one sentence per line. Tokens are separated by a single space.
The output files end with .con.txt. Each contains ONLY the sentences from the corresponding input file that were annotated with an emotion. Each annotation uses the following tags:
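- c="The original sentence ." (the text of the annotated sentence)
- sentence_number:start_token sentence_number:end_token (the annotated span; in the example below, sentence numbers count lines from 1 and token offsets count from 0, with the end token included)
- ||e="concept_name" (the emotion assigned to the sentence)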
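The three tags above can be read back with a short script. The following is only a minimal sketch in Python, not part of the distributed data: the function name parse_annotation and the regular expression are assumptions inferred from the tag list, and the pattern expects the tags to appear on one line in the order shown.

```python
import re

# Hypothetical pattern inferred from the tag list; not an official grammar.
# The greedy .* lets the sentence text itself contain double quotes.
ANNOTATION_RE = re.compile(
    r'^c="(?P<text>.*)" '
    r'(?P<start_line>\d+):(?P<start_tok>\d+) '
    r'(?P<end_line>\d+):(?P<end_tok>\d+)'
    r'\|\|e="(?P<emotion>[^"]*)"$'
)


def parse_annotation(line):
    """Split one .con.txt line into its text, span, and emotion."""
    match = ANNOTATION_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not an annotation line: {line!r}")
    return {
        "text": match.group("text"),
        "start": (int(match.group("start_line")), int(match.group("start_tok"))),
        "end": (int(match.group("end_line")), int(match.group("end_tok"))),
        "emotion": match.group("emotion"),
    }


record = parse_annotation('c="I am lost and frightened ." 7:0 7:5||e="fear"')
print(record["emotion"], record["start"], record["end"])
```

Each span is returned as a (sentence_number, token_offset) pair so that it can be compared directly with the numbers in the file.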
Here is an example:
INPUT FILE: 20080901735_0621.txt
John : I am going to tell you this at the last .
You and John and Mother are what I am thinking - I ca n't go on - my life is ruined .
I am ill and heart - broken .
Always I have felt alone and never more alone than now .
John .
Please God forgive me for all my wrong doing .
I am lost and frightened .
God help me ,
Bless my son and my mother .
OUTPUT FILE: 20080901735_0621.con.txt
c="You and John and Mother are what I am thinking - I ca n't go on - my life is ruined ." 2:0 2:21||e="hopelessness"
c="Always I have felt alone and never more alone than now ." 4:0 4:11||e="sorrow"
c="I am lost and frightened ." 7:0 7:5||e="fear"
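The offsets in the example can be checked against the input file. The sketch below is in Python and assumes what the example suggests rather than what a specification states: sentence numbers count lines from 1, token offsets count from 0, and the end token is included. The function name span_tokens is hypothetical.

```python
def span_tokens(sentences, start, end):
    """Return the tokens covered by a span such as (7, 0) to (7, 5).

    Assumption drawn from the example: 1-based sentence numbers and
    0-based, inclusive token offsets.
    """
    (start_line, start_tok), (end_line, end_tok) = start, end
    if start_line != end_line:
        raise ValueError("this sketch only handles single-sentence spans")
    tokens = sentences[start_line - 1].split()
    return tokens[start_tok:end_tok + 1]


# First seven lines of the example input file, one sentence per line.
sentences = [
    "John : I am going to tell you this at the last .",
    "You and John and Mother are what I am thinking - I ca n't go on - my life is ruined .",
    "I am ill and heart - broken .",
    "Always I have felt alone and never more alone than now .",
    "John .",
    "Please God forgive me for all my wrong doing .",
    "I am lost and frightened .",
]

fear_span = span_tokens(sentences, (7, 0), (7, 5))
print(" ".join(fear_span))
```

Under these assumptions every span in the example covers its whole sentence, from token 0 to the final period.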