Lecture 1: Introduction to Data Science

Narjes Mathlouthi

2025-06-23

Welcome to PSTAT 5A!

Instructor

  • Narjes Mathlouthi (nmathlouthi@ucsb.edu)

Office Hours (Zoom):

  • Thursdays 11 AM–12 PM

Teaching Assistants

  • Summer Lee (sle@ucsb.edu)

  • Mingzhu He (mingzhuhe@ucsb.edu)

Course Resources

  • Canvas: Grades & Announcements
  • Gradescope: Quizzes & Labs
    • Entry code: WJ4XR7
  • Course Website: bit.ly/3Ga8CSK
    • All lecture slides, labs, and code will be posted here

Labs: Interactive, hands‐on Python computing sessions hosted on JupyterHub.
Access the lab environment at: https://pstat5a.lsit.ucsb.edu

Communication & Email

  • Priority: Bring non‐urgent questions to office hours or after lecture rather than emailing.
  • Email Subject: Always include [PSTAT 5A] to help us sort and reply efficiently.
  • Response Time: Please allow 24–48 hours for replies; avoid sending emails over weekends.

What is Data Science?

  • No single agreed-upon definition
  • A cross-disciplinary field:
    • Statistics: theory of modeling & randomness
    • Computer Science: computation & data handling

Why Theory Matters

  • Data today is huge; computation alone isn’t enough
  • Theory guides how and why we apply tools
  • Employers need analysts who both understand the theory and can apply it

Path Forward: Course Outline

  1. Descriptive Statistics: Summarize & visualize data
  2. Probability: Random variables & distributions
  3. Inferential Statistics: Confidence intervals & hypothesis tests
  4. Regression: Modeling relationships
  5. Data Collection: Sampling & study design
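
As a small preview of the first topic, here is a minimal sketch of descriptive statistics in Python, the language used in the labs. The scores below are made up purely for illustration, and the variable names are hypothetical; the sketch uses only the standard-library statistics module, which may differ from the tools used in the actual labs.

```python
import statistics

# Hypothetical quiz scores, invented for illustration only
scores = [72, 85, 90, 66, 85, 78, 94]

# Measures of center
mean_score = statistics.mean(scores)      # arithmetic average
median_score = statistics.median(scores)  # middle value once sorted
mode_score = statistics.mode(scores)      # most frequent value

# Measure of spread (sample standard deviation)
spread = statistics.stdev(scores)

print(f"mean:   {mean_score:.2f}")
print(f"median: {median_score}")
print(f"mode:   {mode_score}")
print(f"stdev:  {spread:.2f}")
```

Each number summarizes the whole list in a different way: the mean and median describe a typical score, while the standard deviation describes how far scores tend to fall from the mean.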

Why Should I Care?

  • Data is everywhere; any field dealing with data needs these skills
  • Companies seek insightful analysts, not just code runners
  • This course equips you with both theory and practice

Let’s Get Started!

Ready to dive into Descriptive Statistics?