Presentation Title

A data science approach to identify crucial factors of predicting test performance in Program for International Student Assessment

Faculty Mentor

Chong Ho Yu

Start Date

23-11-2019 10:45 AM

End Date

23-11-2019 11:30 AM

Location

8

Session

poster 4

Type of Presentation

Poster

Subject Area

Behavioral and Social Sciences

Abstract

The test results of the Programme for International Student Assessment (PISA) have been used for benchmarking and informing education practice since 2000. The objective of this study is to identify the most important predictors of performance on the PISA math and science tests among top performers. However, these big data, collected from nations and regions across the globe, pose a challenge to traditional statistics, which tends to yield false claims when the sample size is large and is restricted by parametric assumptions. To rectify the situation, this study employed data science methods, including ensemble methods, machine learning, and data visualization, to analyze 2015 PISA data (n = 78,488). This sample is a subset of the entire PISA data set: only observations from the top 10 performing countries and regions were extracted. The US sample was also included in order to examine whether the US model differs from the top-performer models. It was found that out of 611 variables, which fall under different categories (e.g., parent, household environment, teacher, school resources, technology, student, etc.), fewer than 10 factors predict the test outcome. Additionally, problem-solving plays a less significant role in academic performance among American students than among their international peers.

