CS 2731: Intro to Natural Language Processing
Instructor: Malihe Alikhani
TA: Zhexiong Liu
Time and Place: Tuesday and Thursday 4:00 pm - 5:15 pm
Office hours: Thursday 3 to 4 pm & by appointment
Assignments will be submitted and graded using Canvas
Course questions and discussion: Slack
Tentative Schedule
Week 1: Introduction to NLP
Recommended Reading: SLP ch 1
Week 2: Neural Networks for Text Classification
Recommended Reading: SLP ch 4, E ch 2
Week 3: Neural Networks for Text Classification
Recommended Reading: E ch 3, E ch 4, G ch 2
Week 5: Vector semantics and static word embeddings
Recommended Reading: E ch 14, SLP ch 6
Week 7: Sequence Labeling: POS & HMM, Viterbi & Forward Alg
Recommended Reading: SLP ch 8, E ch 7
Week 8: Context-Free Grammars
Recommended Reading: SLP ch 12, E ch 9
Week 9: Constituency Parsing and Dependency Parsing
Recommended Reading: SLP ch 14
Midterm exam
Week 10: Coreference Resolution
Recommended Reading: SLP ch 21, E ch 12
Week 11: Machine Translation
Recommended Reading: SLP ch 19
Week 12: Discourse Coherence
Recommended Reading: SLP ch 22
Week 13: Conversational Agents
Recommended Reading: E ch 18, G ch 17
Week 14: Ethics
Recommended Reading: TBD
Week 15: Student Presentations
Texts
- [SLP3] Dan Jurafsky and James Martin, Speech and Language Processing (3rd ed. draft)
- [E] Jacob Eisenstein, Natural Language Processing (2018)
- [G] Yoav Goldberg, Neural Network Methods for Natural Language Processing (2017)
- [PS] James Pustejovsky and Amber Stubbs, Natural Language Annotation for Machine Learning (2012)
Course Requirements and Grading
Programming assignment
Four programming assignments will be posted on Canvas.
Final research project
Final projects should ideally be done by teams of two students. Projects done by one or three students are possible on rare occasions with prior approval of the instructor. Each student is responsible for finding a partner for their final project. Feel free to post a message on Slack about your interests in order to find a partner with similar interests.
Submit a one-page project proposal by September 16 that briefly covers the following questions:
1) What is the problem or task that you propose to study?
2) What is the motivation for this problem or task, i.e., why is it important?
3) What is the novel algorithm or approach to this problem?
4) How is this method evaluated, i.e., what is the experimental methodology, what data is used, and what performance metrics are used?
5) What ethical issues may be raised by this model?
I will provide feedback on these proposals. Students are encouraged to discuss project proposals with me during office hours before submitting them.
Groups are required to give a 10-minute presentation about their progress on the final project in the 8th week of the class.
During the last week of class, each team will be required to give a 15-minute presentation on the current state of their project and lead a 10-minute discussion of it. Use the same format as the paired research paper presentations (see above); you may have only preliminary results to present at that time.
Submit the code to the GitHub repo for the class by midnight on Dec 10.
Final grade
The final grade will be computed as follows:
- HW: 40%
  - HW1: 10%
  - HW2: 10%
  - HW3: 15%
  - HW4: 5%
- Midterm: 20%
- Final Project: 35%
  - Project proposal: 5%
  - Midterm report: 5%
  - Final presentation: 5%
  - Final Report: 20%
- Participation: 5%
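As an informal illustration of how these weights combine, here is a minimal Python sketch. It assumes every component is scored on a 0-100 scale and that a missing component counts as zero; both are assumptions made for the example, not stated course policy. The names `WEIGHTS` and `final_grade` are hypothetical.

```python
# Weights copied from the breakdown above, as percent of the final grade.
WEIGHTS = {
    "HW1": 10,
    "HW2": 10,
    "HW3": 15,
    "HW4": 5,
    "Midterm": 20,
    "Project proposal": 5,
    "Midterm report": 5,
    "Final presentation": 5,
    "Final Report": 20,
    "Participation": 5,
}


def final_grade(scores):
    """Return the weighted final grade on a 0-100 scale.

    `scores` maps component names to scores in [0, 100].
    Assumption for this sketch: a missing component counts as 0.
    """
    # Sanity check: the listed weights should add up to 100%.
    assert sum(WEIGHTS.values()) == 100
    total = sum(weight * scores.get(name, 0) for name, weight in WEIGHTS.items())
    return total / 100


if __name__ == "__main__":
    # Hypothetical student who scores 90 on every component.
    example_scores = {name: 90 for name in WEIGHTS}
    print(final_grade(example_scores))
```

Because the weights sum to 100, a student with the same score on every component receives that score as the final grade.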
Academic integrity
All assignment submissions must be the sole work of each individual student. Students may not read or copy another student's solutions or share their own solutions with other students. Students may not review solutions from students who have taken the course in previous years. Submissions that are substantively similar will be considered cheating by all students involved, and as such, students must be mindful not to post their code publicly. The use of books and online resources is allowed, but must be credited in submissions, and material may not be copied verbatim. Any use of electronics or other resources during an examination will be considered cheating.
If you have any doubts about whether a particular action may be construed as cheating, ask the instructor for clarification before you do it. The instructor will make the final determination of what is considered cheating.
Cheating in this course will result in a grade of F for the course and may be subject to further disciplinary action.
Using an open-source codebase is acceptable, but you must explicitly cite the source and follow the owner's citation guidelines if any exist. For any writing involved in the project, plagiarism is strictly prohibited. If you are unsure whether your work will be considered plagiarism, ask the instructor before submitting or presenting the work.
Students with disabilities
If you have a disability for which you are or may be requesting an accommodation, you are encouraged to contact both your instructor and Disability Resources and Services, 216 William Pitt Union, 412-648-7890 or 412-383-7355 (TTY) as early as possible in the term. DRS will verify your disability and determine reasonable accommodations for this course. More info at www.drs.pitt.edu.
Audio/video recordings
To ensure the free and open discussion of ideas, students may not record classroom lectures, discussions, and/or activities without the advance written permission of the instructor, and any such recording properly approved in advance can be used solely for the student's own private use. Since this is a seminar class and meetings are all online, students are required to use their cameras during the class.
Copyrighted materials
All material provided through this website is subject to copyright. This applies to class/recitation notes, slides, assignments, solutions, project descriptions, etc. You are allowed (and expected!) to use all the provided material for personal use. However, you are strictly prohibited from sharing the material with others in general and from posting the material on the Web or other file-sharing venues in particular.
Diversity and Inclusion
The University of Pittsburgh does not tolerate any form of discrimination, harassment, or retaliation based on disability, race, color, religion, national origin, ancestry, genetic information, marital status, familial status, sex, age, sexual orientation, veteran status or gender identity or other factors as stated in the University’s Title IX policy. The University is committed to taking prompt action to end a hostile environment that interferes with the University’s mission. For more information about policies, procedures, and practices, see: http://diversity.pitt.edu/affirmative-action/policies-procedures-and-practices.
I ask that everyone in the class strive to help ensure that other members of this class can learn in a supportive and respectful environment. If there are instances of the aforementioned issues, please contact the Title IX Coordinator by calling 412-648-7860 or e-mailing titleixcoordinator@pitt.edu. Reports can also be filed online: https://www.diversity.pitt.edu/make-report/report-form. You may also choose to report this to a faculty/staff member; they are required to communicate this to the University’s Office of Diversity and Inclusion. If you wish to maintain complete confidentiality, you may also contact the University Counseling Center (412-648-7930).
Gender Inclusive Language Statement (from Pitt GSWS)
Language is gender-inclusive and non-sexist when we use words that affirm and respect how people describe, express and experience their gender. Just as sexist language excludes women’s experiences, non-gender-inclusive language excludes the experiences of individuals whose identities may not fit the gender binary, and/or who may not identify with the sex they were assigned at birth. Identities including trans, intersex, and genderqueer reflect personal descriptions, expressions, and experiences. Gender-inclusive/non-sexist language acknowledges people of any gender (for example, first-year student versus freshman, chair versus chairman, humankind versus mankind, etc.). It also affirms non-binary gender identifications and recognizes the difference between biological sex and gender expression. Students may share their preferred pronouns and names, and these gender identities and gender expressions should be honored.