JOHN CABOT UNIVERSITY

COURSE CODE: "CS 212"
COURSE NAME: "Introduction to Data Science"
SEMESTER & YEAR: Spring 2025
SYLLABUS

INSTRUCTOR: Alice Fabbri
EMAIL: [email protected]
HOURS: TTH 3:00 PM - 4:15 PM
TOTAL NO. OF CONTACT HOURS: 45
CREDITS: 3
PREREQUISITES: CS 160, MA 100/101
OFFICE HOURS: by appointment

COURSE DESCRIPTION:
This course introduces students to the main concepts of data science. It combines statistics, ethics, computational learning theory, pattern recognition, and containerization to create and implement Machine Learning and Deep Learning models for classification and prediction. Such models may have a significant impact on society, as they can be used to automate procedures and extract relevant information from large amounts of data. Students will learn how to detect and correct the implicit and explicit bias often found in A.I. and Machine Learning algorithms by assessing the quality and objectivity of training data. This matters for determining the validity and veracity of information (such as that found on social media) and for threat analysis (as in cybersecurity). The course includes a critique of the inherent biases of data science itself and their societal implications. The course uses project-based learning: students will be guided through the process of formulating and carrying out data science methodology with real-world data, with a focus on open, pre-existing secondary data. Topics covered include descriptive statistics, elementary probability theory, basics of linear algebra, ethics in emerging technology, nonparametric decision-making methods such as Euclidean distance, nearest neighbor, support vector machines, and decision trees, and supervised and unsupervised learning techniques such as neural networks, kernel machines, and convolutional networks.
SUMMARY OF COURSE CONTENT:

·      Python libraries for data science

·      Introduction to methodology for data science projects

·      Data cleaning and preparation

·      Exploratory data analysis and visualizations

·      Modeling: unsupervised, supervised, and deep learning

·      Performance assessment and model selection

·      Bias in ML

LEARNING OUTCOMES:

Students completing this course will be able to:

·      Analyze data using statistical methods and programming tools

·      Identify the appropriate machine learning task for a given problem

·      Apply machine learning techniques to solve problems

·      Communicate data-driven insights effectively

TEXTBOOK:
Book Title | Author | Publisher | ISBN number | Format
An Introduction to Statistical Learning | Gareth James, Daniela Witten, Trevor Hastie, Rob Tibshirani, Jonathan Taylor | Springer (available for download at https://www.statlearning.com) | 978-3-031-38747-0 | Ebook
R for Data Science | Hadley Wickham, Mine Çetinkaya-Rundel, and Garrett Grolemund | O'Reilly Media (available for download at https://r4ds.hadley.nz/) | 978-1492097402 | Ebook
Data Science from Scratch: First Principles with Python | Joel Grus | O'Reilly Media | 978-1492041139 | Ebook
REQUIRED RESERVED READING:
NONE

RECOMMENDED RESERVED READING:
NONE
GRADING POLICY
-ASSESSMENT METHODS:
Assignment | Guidelines | Weight
Attendance and participation | Attendance and participation are fundamental, as students will be involved in practical work during lessons. | 10%
Home Assignments | Four assignments based on reporting, coding, debugging, and interpretation of results. | 20%
Final Project | Implementation of a data science project, to verify the knowledge acquired by the students in the course. Students will be asked to present their project in class. | 20%
Quizzes | Three in-class quizzes, each on the most recent topics. | 30%
Midterm exam | One in-class assessment at the end of the first half of the course. | 20%

-ASSESSMENT CRITERIA:
A: Work of this quality directly addresses the question or problem raised and provides a coherent argument displaying an extensive knowledge of relevant information or content. This type of work demonstrates the ability to critically evaluate concepts and theory and has an element of novelty and originality. There is clear evidence of a significant amount of reading beyond that required for the course.
B: This is a highly competent level of performance and directly addresses the question or problem raised. There is a demonstration of some ability to critically evaluate theory and concepts and relate them to practice. Discussions reflect the student’s own arguments and are not simply a repetition of standard lecture and reference material. The work does not suffer from any major errors or omissions and provides evidence of reading beyond the required assignments.
C: This is an acceptable level of performance and provides answers that are clear but limited, reflecting the information offered in the lectures and reference readings.
D: This level of performance demonstrates that the student lacks a coherent grasp of the material. Important information is omitted and irrelevant points included. In effect, the student has barely done enough to persuade the instructor that s/he should not fail.
F: This work fails to show any knowledge or understanding of the issues raised in the question. Most of the material in the answer is irrelevant.

-ATTENDANCE REQUIREMENTS:
ATTENDANCE REQUIREMENTS AND EXAMINATION POLICY
You cannot make up a major exam (midterm or final) without the permission of the Dean’s Office. The Dean’s Office will grant such permission only when the absence was caused by a serious impediment, such as a documented illness, hospitalization or death in the immediate family (in which case you must attend the funeral) or other situations of similar gravity. Absences due to other meaningful conflicts, such as job interviews, family celebrations, travel difficulties, student misunderstandings or personal convenience, will not be excused. Students who will be absent from a major exam must notify the Dean’s Office prior to that exam. Absences from class due to the observance of a religious holiday will normally be excused. Individual students who will have to miss class to observe a religious holiday should notify the instructor by the end of the Add/Drop period to make prior arrangements for making up any work that will be missed. The final exam period runs until ____________
ACADEMIC HONESTY
As stated in the university catalog, any student who commits an act of academic dishonesty will receive a failing grade on the work in which the dishonesty occurred. In addition, acts of academic dishonesty, irrespective of the weight of the assignment, may result in the student receiving a failing grade in the course. Instances of academic dishonesty will be reported to the Dean of Academic Affairs. A student who is reported twice for academic dishonesty is subject to summary dismissal from the University. In such a case, the Academic Council will then make a recommendation to the President, who will make the final decision.
STUDENTS WITH LEARNING OR OTHER DISABILITIES
John Cabot University does not discriminate on the basis of disability or handicap. Students with approved accommodations must inform their professors at the beginning of the term. Please see the website for the complete policy.

SCHEDULE

Session | Session Focus | Reading Assignment | Other Assignment | Meeting Place/Exam Dates
Weeks 1 and 2 | Introduction to data science projects and Python libraries for data science | | |
Weeks 3 to 5 | Exploratory data analysis. Data cleaning and preprocessing of data. | | |
Week 6 | Supervised learning - Regression: linear regression | | |
Weeks 7 to 9 | Supervised learning - Classification: logistic regression, decision trees, k-nearest neighbors. | | |
Weeks 9 to 11 | Deep Learning - Artificial Neural networks with various architectures | | |
Weeks 12 to 13 | Unsupervised learning: clustering | | |
Week 14 | ML bias | | |