13: Study Notes



Study notes to Chapter 1

Resources

The British National Corpus (BNC) is available for download free of charge from the Oxford Text Archive at http://ota.ox.ac.uk/desc/2554.
The Corpus of Contemporary American English (COCA) is commercially available from Mark Davies at Brigham Young University, who also provides a free web interface at https://corpus.byu.edu/coca/.

Study notes to Chapter 2

Resources

The Lancaster-Oslo-Bergen Corpus of Modern English (LOB) is available free of charge from the Oxford Text Archive at http://purl.ox.ac.uk/ota/0167.
The British National Corpus, Baby edition (BNC-BABY) is available for download free of charge from the Oxford Text Archive at http://purl.ox.ac.uk/ota/2553.
The London-Lund Corpus of Spoken English is available free of charge from the Oxford Text Archive at http://purl.ox.ac.uk/ota/0168.
The Susanne Corpus is available with some restrictions from the Oxford Text Archive at http://purl.ox.ac.uk/ota/1708.
Parts of the Santa Barbara Corpus of Spoken American English (SBCSAE) are available for download and through a web interface at https://ca.talkbank.org/access/SBCSAE.html.
The International Corpus of English, British Component is commercially available from http://ice-corpora.net/ice/; the components for some other varieties (Canada, East Africa, Hong Kong, India, Ireland, Jamaica, Philippines, Singapore and USA) can be downloaded at that URL after written registration.
The Brown University Standard Corpus of Present-Day American English (BROWN), the Freiburg-Brown Corpus of American English (FROWN), the Freiburg–LOB Corpus of British English (FLOB) and the WELLINGTON corpus are available to institutions participating in the CLARIN project at http://clarino.uib.no/korpuskel/.
A version of the BROWN corpus can also be downloaded at http://www.nltk.org/nltk_data/, but note that this is not the original version, and some texts are partially missing.

Study notes to Chapter 3

Resources

The Switchboard Corpus is available free of charge after written registration from the Linguistic Data Consortium; see https://catalog.ldc.upenn.edu/LDC97S62.
WordNet is available for download free of charge from Princeton University at https://wordnet.princeton.edu.
Many major dictionaries of English are currently searchable online free of charge. The following are recommended and used in this book:
- Various Cambridge dictionaries, including the Cambridge Advanced Learner's Dictionary (CALD), https://dictionary.cambridge.org
- The Collins Dictionary (formerly Collins COBUILD Advanced Dictionary), https://www.collinsdictionary.com/
- Longman Dictionary of Contemporary English (LDCE), https://www.ldoceonline.com
- Merriam-Webster (MW), https://www.merriam-webster.com
- Various Oxford dictionaries, including the Oxford Advanced Learner's Dictionary (OALD), https://www.oxfordlearnersdictionaries.com

Study notes to Chapter 4

Resources

The ICE-GB sample corpus is available at http://www.ucl.ac.uk/englishusage/pr...beta/index.htm.
The IMS Open Corpus Work Bench (CWB) is available for download free of charge at http://cwb.sourceforge.net/; it can be installed under all Unix-like operating systems (including Linux and Mac OS X).
The NoSketch Engine is available for download free of charge at https://nlp.fi.muni.cz/trac/noske for Linux.
The Tree Tagger is available for download at http://www.cis.unimuenchen.de/~schmi...ls/TreeTagger/ for Linux, Mac OS X and Windows.

Study notes to Chapter 5

(see Study notes to Chapter 6)

Study notes to Chapter 6

Resources

A comprehensive and well-maintained statistical software package is R, available for download free of charge from https://www.r-project.org/ for Linux, Mac OS X and Windows.
Especially if you are using Linux or Windows, I also recommend that you download RStudio (also free of charge), which provides an advanced user interface to R, at https://www.rstudio.com/products/rstudio/.

Study notes to Chapter 7

If you want to learn more about association measures, Evert (2005) and the companion website at http://www.collocations.de/AM/ are very comprehensive and relatively accessible places to start. Stefanowitsch & Flach (2016) discuss corpus-based association measures in the context of psycholinguistics.

Study notes to Chapter 8

Resources

The Corpus of Late Modern English Texts (CLMET v3.1) is available for download free of charge at https://fedora.clarin-d.uni-saarland...met/clmet.html.

Study notes to Chapter 9

Study notes to Chapter 10

Resources

The Corpus of Historical American English (COHA) is commercially available from Mark Davies at Brigham Young University, who also provides a free web interface at https://corpus.byu.edu/coha/.
The \(n\)-gram data from the Google Books archive is available for download free of charge at http://storage.googleapis.com/books/ngrams/books/datasetsv2.html (note that the files are extremely large).
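
Because the Google Books files are so large, it is usually more practical to read them line by line than to load them into memory. The following Python code is a minimal sketch of that approach, not an official tool. It assumes that each file is gzip-compressed and that each line has four tab-separated fields (n-gram, year, match count, volume count); check this against the dataset description before relying on it. The function names and the sample data are hypothetical and serve only as illustration.

```python
import gzip
import io
from collections import defaultdict


def yearly_counts(lines, target):
    """Sum the match counts per year for one n-gram.

    `lines` is any iterable of text lines in the ASSUMED layout
    ngram <TAB> year <TAB> match_count <TAB> volume_count.
    """
    totals = defaultdict(int)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 4:
            continue  # skip lines that do not fit the assumed layout
        ngram, year, match_count, _volume_count = fields
        if ngram == target:
            totals[int(year)] += int(match_count)
    return dict(totals)


def yearly_counts_from_gzip(path, target):
    """Stream a gzip-compressed file from disk without unpacking it first."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return yearly_counts(handle, target)


# Invented sample lines standing in for a downloaded file.
sample = io.StringIO(
    "corpus\t1990\t120\t45\n"
    "corpus\t1991\t150\t50\n"
    "corpora\t1990\t30\t12\n"
)
result = yearly_counts(sample, "corpus")
print(result)
```

For a downloaded file, `yearly_counts_from_gzip` can be called with the local file path and the n-gram of interest; only one line is held in memory at a time.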
