|
|
JOHN CABOT UNIVERSITY
COURSE CODE: "CS 212"
COURSE NAME: "Introduction to Data Science"
SEMESTER & YEAR:
Spring 2025
|
SYLLABUS
INSTRUCTOR:
Alice Fabbri
EMAIL: [email protected]
HOURS:
TTH 3:00 PM - 4:15 PM
TOTAL NO. OF CONTACT HOURS:
45
CREDITS:
3
PREREQUISITES:
CS 160, MA 100/101
OFFICE HOURS:
by appointment
|
|
COURSE DESCRIPTION:
This course introduces students to the main concepts of data science. It combines statistics, ethics, computational learning theory, pattern recognition, and containerization to create and implement Machine Learning and Deep Learning models for classification and prediction. Such models may have a significant impact on society, as they can be used to automate procedures and extract relevant information from large amounts of data. Students will learn how to detect and correct the implicit and explicit bias often found in A.I. and Machine Learning algorithms by assessing the quality and objectivity of training data. This is important for determining the validity/veracity of information (such as that found on social media) and in threat analysis (as in cybersecurity). The course includes a critique of the inherent biases of data science itself and their societal implications. The course uses project-based learning: students will be guided through formulating and applying data science methodology to real-world data, with a focus on open, pre-existing secondary data. Topics covered include descriptive statistics, elementary probability theory, basics of linear algebra, ethics in emerging technology, nonparametric decision-making methods such as Euclidean distance, nearest neighbor, support vector machines, and decision trees, and supervised and unsupervised learning techniques such as neural networks, kernel machines, and convolutional networks.
|
SUMMARY OF COURSE CONTENT:
· Python libraries for data science
· Introduction to methodology for data science projects
· Data cleaning and preparation
· Exploratory data analysis and visualizations
· Modeling: unsupervised, supervised, and deep learning
· Performance assessment and model selection
· Bias in ML
|
LEARNING OUTCOMES:
Students completing this course will be able to:
· Analyze data using statistical methods and programming tools
· Identify the appropriate machine learning task for a given problem
· Apply machine learning techniques to solve problems
· Communicate data-driven insights effectively
|
TEXTBOOK:
Book Title | Author | Publisher | ISBN number | Library Call Number | Comments | Format | Local Bookstore | Online Purchase |
An Introduction to Statistical Learning | Gareth James, Daniela Witten, Trevor Hastie, Rob Tibshirani, Jonathan Taylor | Springer - available for download at https://www.statlearning.com | 978-3-031-38747-0 | | | Ebook | | |
R for Data Science | Hadley Wickham, Mine Çetinkaya-Rundel, and Garrett Grolemund | O'Reilly Media - available for download at https://r4ds.hadley.nz/ | 978-1492097402 | | | Ebook | | |
Data Science from Scratch: First Principles with Python | Joel Grus | O'Reilly Media | 978-1492041139 | | | Ebook | | |
|
REQUIRED RESERVED READING:
RECOMMENDED RESERVED READING:
|
GRADING POLICY
-ASSESSMENT METHODS:
Assignment | Guidelines | Weight |
Attendance and participation | Attendance and participation are fundamental, as students will be involved in practical work during lessons. | 10% |
Home Assignments | Four assignments based on reporting, coding, debugging, and interpretation of results. | 20% |
Final Project | Implementation of a data science project, to verify the knowledge acquired by the students in the course. Students will be asked to present their project in class. | 20% |
Quizzes | Three in-class quizzes, each on the most recent topics. | 30% |
Midterm exam | One in-class assessment at the end of the first half of the course. | 20% |
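As a rough illustration of how the weights in the table combine, the sketch below computes a weighted overall score in Python. It assumes each component is scored on a 0-100 scale and that the components are combined as a simple weighted average; the syllabus does not state the exact formula, and the function name `weighted_score` and the sample scores are hypothetical.

```python
# Weights taken from the assessment table above (they sum to 100%).
WEIGHTS = {
    "attendance_participation": 0.10,
    "home_assignments": 0.20,
    "final_project": 0.20,
    "quizzes": 0.30,
    "midterm_exam": 0.20,
}


def weighted_score(scores: dict[str, float]) -> float:
    """Combine per-component scores (assumed 0-100) into one weighted score."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {
    "attendance_participation": 100,
    "home_assignments": 80,
    "final_project": 90,
    "quizzes": 70,
    "midterm_exam": 85,
}
print(round(weighted_score(example), 2))
```

With these hypothetical scores the components contribute 10 + 16 + 18 + 21 + 17 points, for an overall score of 82. How that score maps to a letter grade is not specified here.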
-ASSESSMENT CRITERIA:
A: Work of this quality directly addresses the question or problem raised and provides a coherent argument displaying an extensive knowledge of relevant information or content. This type of work demonstrates the ability to critically evaluate concepts and theory and has an element of novelty and originality. There is clear evidence of a significant amount of reading beyond that required for the course.
B: This is a highly competent level of performance and directly addresses the question or problem raised. There is a demonstration of some ability to critically evaluate theory and concepts and relate them to practice. Discussions reflect the student’s own arguments and are not simply a repetition of standard lecture and reference material. The work does not suffer from any major errors or omissions and provides evidence of reading beyond the required assignments.
C: This is an acceptable level of performance and provides answers that are clear but limited, reflecting the information offered in the lectures and reference readings.
D: This level of performance demonstrates that the student lacks a coherent grasp of the material. Important information is omitted and irrelevant points are included. In effect, the student has barely done enough to persuade the instructor that s/he should not fail.
F: This work fails to show any knowledge or understanding of the issues raised in the question. Most of the material in the answer is irrelevant.
-ATTENDANCE REQUIREMENTS:
ATTENDANCE REQUIREMENTS AND EXAMINATION POLICY
You cannot make up a major exam (midterm or final) without the permission of the Dean’s Office. The Dean’s Office will grant such permission only when the absence was caused by a serious impediment, such as a documented illness, hospitalization, a death in the immediate family (when you must attend the funeral), or other situations of similar gravity. Absences due to other meaningful conflicts, such as job interviews, family celebrations, travel difficulties, student misunderstandings, or personal convenience, will not be excused. Students who will be absent from a major exam must notify the Dean’s Office prior to that exam. Absences from class due to the observance of a religious holiday will normally be excused. Individual students who will have to miss class to observe a religious holiday should notify the instructor by the end of the Add/Drop period to make prior arrangements for making up any work that will be missed. The final exam period runs until ____________
|
|
ACADEMIC HONESTY
As stated in the university catalog, any student who commits an act of academic
dishonesty will receive a failing grade on the work in which the dishonesty occurred.
In addition, acts of academic dishonesty, irrespective of the weight of the assignment,
may result in the student receiving a failing grade in the course. Instances of
academic dishonesty will be reported to the Dean of Academic Affairs. A student
who is reported twice for academic dishonesty is subject to summary dismissal from
the University. In such a case, the Academic Council will then make a recommendation
to the President, who will make the final decision.
|
STUDENTS WITH LEARNING OR OTHER DISABILITIES
John Cabot University does not discriminate on the basis of disability or handicap.
Students with approved accommodations must inform their professors at the beginning
of the term. Please see the website for the complete policy.
|
|
SCHEDULE
|
|
|
Session | Session Focus | Reading Assignment | Other Assignment | Meeting Place/Exam Dates |
Weeks 1 and 2 | Introduction to data science projects and Python libraries for data science | | | |
Weeks 3 to 5 | Exploratory data analysis. Data cleaning and preprocessing. | | | |
Week 6 | Supervised learning - Regression: linear regression | | | |
Weeks 7 to 9 | Supervised learning - Classification: logistic regression, decision trees, k-nearest neighbors. | | | |
Weeks 9 to 11 | Deep learning - Artificial neural networks with various architectures | | | |
Weeks 12 to 13 | Unsupervised learning: clustering | | | |
Week 14 | ML bias | | | |
|