AID101 Fundamentals of Data Science

Course Code | Course Title | Weekly Hours* (T / P) | ECTS | Weekly Class Schedule
AID101 | Fundamentals of Data Science | 3 / 2 | 6 |
Prerequisite None
It is a prerequisite to None

Lecturer Fahir Kanlic Office Hours / Room / Phone
E-mail fkanlic@ius.edu.ba
Assistant Assistant E-mail
Course Objectives The course equips students with theoretical and practical knowledge of data science, a rapidly growing field, together with the technical skills to work in it using a popular programming language (Python). It introduces the core concepts, principles, and tools of data science: data types, data structures, data manipulation techniques, data modelling, data visualization, and machine learning algorithms and techniques. The course emphasizes a hands-on approach, with interactive exercises on real-life datasets from a variety of disciplines, so that students practise applying the techniques and concepts covered. By the end of the course, students will be able to apply specific analytics tools and interpret the resulting solutions.
By the end of this course, students will be able to:
1. Understand the main steps and challenges of a data science project
2. Apply appropriate methods and tools to collect, clean, explore, and visualize data
3. Perform basic statistical analysis and hypothesis testing on data
4. Implement and evaluate common machine learning algorithms for classification and regression
5. Communicate and present data science results effectively and responsibly
Textbook
  1. Grus, J. 2019, Data Science from Scratch: First Principles with Python, 2nd edition, O'Reilly Media.
  2. McKinney, W. 2017, Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython, 2nd edition, O'Reilly Media.
Additional Literature
  1. Kalita, J.K., Bhattacharyya, D.K., Roy, S. 2023, Fundamentals of Data Science: Theory and Practice, Academic Press.
Learning Outcomes After successful completion of the course, the student will be able to:
  1. understand the main steps and challenges of a data science project
  2. apply appropriate methods and tools to collect, clean, explore, and visualize data
  3. perform basic statistical analysis and hypothesis testing on data
  4. implement and evaluate common machine learning algorithms for classification and regression
Teaching Methods A combination of lectures (theory and background for each topic) and practical exercises (programming work in which students apply the learned algorithms to real-world datasets)
Teaching Method Delivery Face-to-face Teaching Method Delivery Notes
WEEK TOPIC REFERENCE
Week 1 Introduction to Data Science | - Define data science and its applications - Explain the data science workflow - Identify the types and sources of data - Install and use Python and Jupyter Notebook
Week 2 Data Collection and Wrangling | - Use Python libraries to read, write, and manipulate data - Perform data cleaning and preprocessing - Handle missing values and outliers - Apply web scraping and APIs to collect data from the web
Week 3 Data Exploration and Visualization | - Use descriptive statistics to summarize data - Use Python libraries to create various types of plots - Explore distributions, correlations, and relationships in data - Apply dimensionality reduction techniques to reduce data complexity
Week 4 Statistical Inference and Hypothesis Testing | - Understand the concepts of population, sample, parameter, and statistic - Apply sampling methods and calculate sampling errors - Perform hypothesis tests and construct confidence intervals - Interpret p-values and significance levels
Week 5 Linear Regression | - Understand the concept of linear regression and its assumptions - Implement simple and multiple linear regression using Python - Evaluate the performance and accuracy of linear regression models - Identify and handle multicollinearity, heteroscedasticity, and non-linearity
Week 6 Logistic Regression | - Understand the concept of logistic regression and its applications - Implement logistic regression using Python - Evaluate the performance and accuracy of logistic regression models - Use confusion matrix, ROC curve, and AUC to measure classification performance
Week 7 Classification Algorithms | - Understand the concept of classification and its applications - Implement k-nearest neighbors (KNN), decision trees, and random forests using Python - Compare the advantages and disadvantages of different classification algorithms - Tune hyperparameters and optimize classification models
Week 8 Clustering Algorithms | - Understand the concept of clustering and its applications - Implement k-means, hierarchical clustering, and DBSCAN using Python - Compare the advantages and disadvantages of different clustering algorithms - Evaluate clustering results using internal and external metrics
Week 9 MID-TERM
Week 10 Association Rule Mining | - Understand the concept of association rule mining and its applications - Implement the Apriori algorithm using Python - Interpret association rules using support, confidence, lift, and leverage - Apply association rule mining to market basket analysis
Week 11 Text Mining | - Understand the concept of text mining and its applications - Perform text preprocessing using Python (tokenization, stemming, lemmatization) - Apply bag-of-words (BOW) and term frequency-inverse document frequency (TF-IDF) to represent text data - Implement sentiment analysis using Python
Week 12 Natural Language Processing (NLP) | - Understand the concept of natural language processing (NLP) and its applications - Apply regular expressions to extract information from text data - Implement named entity recognition (NER) using Python - Use word embeddings to capture the semantic meaning of words
Week 13 Neural Networks | - Understand the concept of neural networks and their applications - Explain the structure and components of a neural network (input layer, hidden layer, output layer) - Implement a simple neural network using Python (feedforward propagation, backpropagation) - Use activation functions, loss functions, and optimization algorithms in neural networks
Week 14 Deep Learning | - Understand the concept of deep learning and its applications - Explain the difference between shallow and deep neural networks - Implement convolutional neural networks (CNNs) using Python for image recognition - Implement recurrent neural networks (RNNs) using Python for sequence modeling - Use TensorFlow and Keras to build and train deep learning models
Week 15 Data Science Ethics and Social Issues | - Understand the ethical and social issues of data science, such as privacy, fairness, accountability, and transparency - Identify and address potential biases and harms in data collection, analysis, and use - Apply ethical principles and frameworks to data science projects - Communicate and present data science results in a responsible and trustworthy manner
Assessment Methods and Criteria
Evaluation Tool | Quantity | Weight (%) | Alignment with LOs
Final Exam | 1 | 30 |
Semester Evaluation Components
Midterm | 1 | 30 |
Quizzes | 2 | 10 |
Term project and presentation | 1 | 10 |
Lab assignments | 10 | 20 |
***     ECTS Credit Calculation     ***
Activity | Hours | Weeks | Student Workload Hours
Lecture hours | 3 | 14 | 42
Active labs | 2 | 14 | 28
In-term exam study | 11 | 1 | 11
Term project/presentation | 2 | 12 | 24
Assignments | 2 | 10 | 20
Home study | 1 | 14 | 14
Final exam study | 11 | 1 | 11
Total Workload Hours = 150
ECTS Credit = 6
*T = Teaching, P = Practice
Course Academic Quality Assurance: Semester Student Survey
Last Update Date: 08/04/2024
