A leading NLP group is at Stanford University (https://www-nlp.stanford.edu). Their software is summarized and available from
http://www-nlp.stanford.edu/software/index.shtml.
Here is example text and the output of the Stanford parser: for each sentence, a phrase-structure tree followed by its typed dependencies.
This article contains a discussion of the history of commercial and academic efforts to automate patent classifications.…
(ROOT
  (S
    (NP (DT This) (NN article))
    (VP (VBZ contains)
      (S
        (NP
          (NP (DT a) (NN discussion))
          (PP (IN of)
            (NP
              (NP (DT the) (NN history))
              (PP (IN of)
                (NP
                  (UCP (JJ commercial)
                    (CC and)
                    (JJ academic))
                  (NNS efforts))))))
        (VP (TO to)
          (VP (VB automate)
            (NP (NN patent) (NNS classifications))))))
    (. .)))
det(article-2, This-1)
nsubj(contains-3, article-2)
root(ROOT-0, contains-3)
det(discussion-5, a-4)
nsubj(automate-15, discussion-5)
det(history-8, the-7)
prep_of(discussion-5, history-8)
amod(efforts-13, commercial-10)
conj_and(commercial-10, academic-12)
amod(efforts-13, academic-12)
prep_of(history-8, efforts-13)
aux(automate-15, to-14)
xcomp(contains-3, automate-15)
nn(classifications-17, patent-16)
dobj(automate-15, classifications-17)
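The bracketed tree above is plain nested parentheses, so it can be read back into a data structure with a few lines of code. The following is a minimal sketch, not the parser's own API: the function names `read_tree` and `leaves` are hypothetical, and the input string is one noun-phrase subtree copied from the output above. It assumes that literal parentheses in the text are escaped as -LRB- and -RRB-, as they are in the parser output shown here.

```python
def read_tree(text):
    """Parse a bracketed tree string into nested lists: [label, child, ...]."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def parse():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '(' at token %d" % pos)
        pos += 1
        node = [tokens[pos]]  # constituent label or part-of-speech tag
        pos += 1
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                node.append(parse())      # nested constituent
            else:
                node.append(tokens[pos])  # a word (leaf)
                pos += 1
        pos += 1  # consume the closing ")"
        return node

    return parse()


def leaves(node):
    """Return the words of the tree in left-to-right order."""
    words = []
    for child in node[1:]:
        if isinstance(child, list):
            words.extend(leaves(child))
        else:
            words.append(child)
    return words


# One subtree copied from the parse above.
subtree = read_tree(
    "(NP (NP (DT the) (NN history)) (PP (IN of) (NP (UCP (JJ commercial) "
    "(CC and) (JJ academic)) (NNS efforts))))"
)
print(subtree[0])       # label of the subtree's root
print(leaves(subtree))  # the words it spans
```

NLTK's `Tree.fromstring` reads the same bracketed format and may be preferable when NLTK is already in use.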
(ROOT
  (S
    (NP (PRP It))
    (ADVP (RB also))
    (VP (VBZ suggests)
      (NP
        (NP (JJ new) (NNS approaches))
        (PRN (-LRB- -LRB-)
          (VP (VBG adding)
            (NP (JJ additional) (JJ structured) (NN language))
            (PP (TO to)
              (NP (DT the) (NN text))))
          (-RRB- -RRB-))
        (SBAR
          (WHNP (WDT that))
          (S
            (PRN (-LRB- -LRB-)
              (S
                (NP (PRP it))
                (VP (VBZ asserts)))
              (-RRB- -RRB-))
            (VP (VBP lead)
              (PP (TO to)
                (NP
                  (ADJP (RB statistically) (JJ meaningful))
                  (NNS improvements))))))))
    (. .)))
nsubj(suggests-3, It-1)
advmod(suggests-3, also-2)
root(ROOT-0, suggests-3)
amod(approaches-5, new-4)
dobj(suggests-3, approaches-5)
nsubj(lead-20, approaches-5)
dep(approaches-5, adding-7)
amod(language-10, additional-8)
amod(language-10, structured-9)
dobj(adding-7, language-10)
det(text-13, the-12)
prep_to(adding-7, text-13)
nsubj(asserts-18, it-17)
parataxis(lead-20, asserts-18)
rcmod(approaches-5, lead-20)
advmod(meaningful-23, statistically-22)
amod(improvements-24, meaningful-23)
prep_to(lead-20, improvements-24)
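Each dependency line above has the form relation(head-index, dependent-index), which makes the output easy to load for further processing. The sketch below is one possible way to do that with a regular expression; the names `DEP_PATTERN`, `parse_dependency`, and `dependents_by_head` are hypothetical, and the sample lines are copied from the first sentence's output.

```python
import re
from collections import defaultdict

# relation(head-index, dependent-index), e.g. "nsubj(contains-3, article-2)"
DEP_PATTERN = re.compile(r"^(\w+)\((.+)-(\d+), (.+)-(\d+)\)$")


def parse_dependency(line):
    """Split one dependency line into (relation, (head, i), (dependent, j))."""
    match = DEP_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"not a dependency line: {line!r}")
    relation, head, head_idx, dep, dep_idx = match.groups()
    return relation, (head, int(head_idx)), (dep, int(dep_idx))


def dependents_by_head(lines):
    """Map each head token to the (relation, dependent) pairs it governs."""
    table = defaultdict(list)
    for line in lines:
        relation, head, dep = parse_dependency(line)
        table[head].append((relation, dep))
    return dict(table)


# A few lines copied from the first sentence's dependencies.
sample = [
    "nsubj(contains-3, article-2)",
    "root(ROOT-0, contains-3)",
    "nsubj(automate-15, discussion-5)",
    "aux(automate-15, to-14)",
    "xcomp(contains-3, automate-15)",
    "dobj(automate-15, classifications-17)",
]
table = dependents_by_head(sample)
for relation, (word, index) in table[("automate", 15)]:
    print(relation, word, index)
```

Grouping by head makes it simple to inspect what the parser attached to a given word, for example that "discussion" was analyzed as the subject of "automate".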
Using NLTK, here is the output of the NLTK and Stanford part-of-speech taggers (the latter via the NLTK API) with default settings, followed by the tokens on which the two taggers differ.
NLTK tagger: (default settings)
0. This : DT = Determiner
1. article : NN = Noun, singular or mass
2. contains : VBZ = Verb, 3rd person singular present
3. a : DT = Determiner
4. discussion : NN = Noun, singular or mass
5. of : IN = Preposition or subordinating conjunction
6. the : DT = Determiner
7. history : NN = Noun, singular or mass
8. of : IN = Preposition or subordinating conjunction
9. commercial : JJ = Adjective
10. and : CC = Coordinating conjunction
11. academic : JJ = Adjective
12. efforts : NNS = Noun, plural
13. to : TO = to
14. automate : VB = Verb, base form
15. patent : NN = Noun, singular or mass
16. classifications. : NNP = Proper noun, singular
17. It : NNP = Proper noun, singular
18. also : RB = Adverb
19. suggests : VBZ = Verb, 3rd person singular present
20. new : JJ = Adjective
21. approaches : NNS = Noun, plural
22. ( : VBP = Verb, non-3rd person singular present
23. adding : VBG = Verb, gerund or present participle
24. additional : JJ = Adjective
25. structured : JJ = Adjective
26. language : NN = Noun, singular or mass
27. to : TO = to
28. the : DT = Determiner
29. text : NN = Noun, singular or mass
30. ) : : = Colon or ellipsis
31. that : IN = Preposition or subordinating conjunction
32. ( : CD = Cardinal number
33. it : PRP = Personal pronoun
34. asserts : VBZ = Verb, 3rd person singular present
35. ) : : = Colon or ellipsis
36. lead : NN = Noun, singular or mass
37. to : TO = to
38. statistically : RB = Adverb
39. meaningful : JJ = Adjective
40. improvements : NNS = Noun, plural
41. . : . = Terminator
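A listing in this format could be produced roughly as follows. This is a sketch under stated assumptions rather than the code that generated the output above: `TAG_DESCRIPTIONS` is a hand-written subset of the Penn Treebank tag descriptions, `format_tagged` and `tag_with_nltk` are hypothetical helper names, and whitespace tokenization is an assumption. `nltk.pos_tag` is NLTK's default tagger entry point and needs NLTK plus its tagger model data installed, so it is imported only inside the helper that uses it.

```python
# Hand-written subset of Penn Treebank tag descriptions (assumption:
# extend as needed for other tags).
TAG_DESCRIPTIONS = {
    "DT": "Determiner",
    "NN": "Noun, singular or mass",
    "NNS": "Noun, plural",
    "VBZ": "Verb, 3rd person singular present",
    "IN": "Preposition or subordinating conjunction",
    "JJ": "Adjective",
}


def format_tagged(tagged):
    """Render (word, tag) pairs as 'index. word : TAG = description' lines."""
    lines = []
    for index, (word, tag) in enumerate(tagged):
        description = TAG_DESCRIPTIONS.get(tag, "Unknown tag")
        lines.append(f"{index}. {word} : {tag} = {description}")
    return lines


def tag_with_nltk(text):
    """Tag whitespace-separated tokens with NLTK's default tagger.

    Requires NLTK and its tagger model data; not called in the demo below.
    """
    import nltk
    return nltk.pos_tag(text.split())


# Demo with hand-written pairs, so it runs without NLTK installed.
for line in format_tagged([("This", "DT"), ("article", "NN"), ("contains", "VBZ")]):
    print(line)
```

With NLTK available, `format_tagged(tag_with_nltk(text))` would produce the full listing for a given text.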
Stanford tagger: (default settings)
0. This : DT = Determiner
1. article : NN = Noun, singular or mass
2. contains : VBZ = Verb, 3rd person singular present
3. a : DT = Determiner
4. discussion : NN = Noun, singular or mass
5. of : IN = Preposition or subordinating conjunction
6. the : DT = Determiner
7. history : NN = Noun, singular or mass
8. of : IN = Preposition or subordinating conjunction
9. commercial : JJ = Adjective
10. and : CC = Coordinating conjunction
11. academic : JJ = Adjective
12. efforts : NNS = Noun, plural
13. to : TO = to
14. automate : VB = Verb, base form
15. patent : JJ = Adjective
16. classifications. : NN = Noun, singular or mass
17. It : PRP = Personal pronoun
18. also : RB = Adverb
19. suggests : VBZ = Verb, 3rd person singular present
20. new : JJ = Adjective
21. approaches : NNS = Noun, plural
22. ( : VBP = Verb, non-3rd person singular present
23. adding : VBG = Verb, gerund or present participle
24. additional : JJ = Adjective
25. structured : JJ = Adjective
26. language : NN = Noun, singular or mass
27. to : TO = to
28. the : DT = Determiner
29. text : NN = Noun, singular or mass
30. ) : NN = Noun, singular or mass
31. that : WDT = Wh-determiner
32. ( : VBZ = Verb, 3rd person singular present
33. it : PRP = Personal pronoun
34. asserts : VBZ = Verb, 3rd person singular present
35. ) : JJ = Adjective
36. lead : NN = Noun, singular or mass
37. to : TO = to
38. statistically : RB = Adverb
39. meaningful : JJ = Adjective
40. improvements : NNS = Noun, plural
41. . : . = Terminator
Differences (for each token, the NLTK tag is listed first and the Stanford tag second):
15. patent : NN = Noun, singular or mass
15. patent : JJ = Adjective
16. classifications. : NNP = Proper noun, singular
16. classifications. : NN = Noun, singular or mass
17. It : NNP = Proper noun, singular
17. It : PRP = Personal pronoun
30. ) : : = Colon or ellipsis
30. ) : NN = Noun, singular or mass
31. that : IN = Preposition or subordinating conjunction
31. that : WDT = Wh-determiner
32. ( : CD = Cardinal number
32. ( : VBZ = Verb, 3rd person singular present
35. ) : : = Colon or ellipsis
35. ) : JJ = Adjective
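The comparison above amounts to walking both tag sequences in parallel and reporting the positions where the tags disagree. A minimal sketch of that step follows; the function name `tag_differences` and its `start` parameter are hypothetical, and the sample pairs are tokens 15 to 18 copied from the two listings.

```python
def tag_differences(tagged_a, tagged_b, start=0):
    """Return (index, word, tag_a, tag_b) for every token tagged differently.

    Both inputs are lists of (word, tag) pairs over the same tokens.
    """
    if len(tagged_a) != len(tagged_b):
        raise ValueError("the two taggers produced different token counts")
    differences = []
    pairs = zip(tagged_a, tagged_b)
    for index, ((word_a, tag_a), (word_b, tag_b)) in enumerate(pairs, start=start):
        if word_a != word_b:
            raise ValueError(f"token mismatch at {index}: {word_a!r} vs {word_b!r}")
        if tag_a != tag_b:
            differences.append((index, word_a, tag_a, tag_b))
    return differences


# Tokens 15-18 as tagged in the two listings above.
nltk_tags = [("patent", "NN"), ("classifications.", "NNP"), ("It", "NNP"), ("also", "RB")]
stanford_tags = [("patent", "JJ"), ("classifications.", "NN"), ("It", "PRP"), ("also", "RB")]

for index, word, first, second in tag_differences(nltk_tags, stanford_tags, start=15):
    print(f"{index}. {word} : {first} vs {second}")
```

The check on token identity guards against comparing outputs that were tokenized differently, which would make an index-by-index comparison meaningless.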