There’s a Data Mart for That (or soon will be)
Today, the average size of a Unizin member UDP tenant is approximately 11TB, and it’s not uncommon to see upwards of 5 million learning events stream into a member UDP tenant on a given day early in the fall semester. As we integrate more context and event data from tools such as Canvas, Top Hat, and Kaltura, the UDP grows larger and more complex.
Still, we can’t lose sight of the fact that creating the UDP was not merely an exercise in scale. It was an exercise in utility. The ultimate purpose of the UDP is to be a decision intelligence engine for Unizin members, enabling new inquiry and delivering valuable insights that will shape the future of student success initiatives. To fulfill this purpose, we need to find efficient and effective ways to unlock the intelligence and insights contained within UDP data.
The task of turning 11TB of data into useful information would typically require professionals with significant SQL expertise to parse massive datasets and complementary staff armed with data visualization tools, dashboards, and custom web applications to deliver that data to various stakeholder groups in easily digestible formats.
Investing in these efforts can help answer some interesting and important questions in higher education:
- What is it worth, from a resource perspective, to increase graduation rates by 5%?
- What is it worth to decrease the rate at which students enter academic probation by 3%?
- What is it worth to increase the number of students who successfully complete large STEM gateway courses?
- What resources are required to ensure a viable journey to student success?
Still, with technical resources in high demand, we can simplify the process and lower the barrier to entry to the UDP, so that data and information are applied more effectively in support of student success inquiry and the initiatives that result from it.
Data marts are the return on our standardization investment
A data mart is a subset of a data warehouse or database that structures or models data around a set of proposed outcomes or a specific line of inquiry. Once deployed, a data mart gathers aggregated and calculated data fields into flat, easily accessible tables or sets of tables that can be used without significant SQL expertise.
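To make the idea concrete, here is a minimal sketch of a data mart built with Python's built-in sqlite3 module. The tables and columns (enrollment, learning_event, mart_student_activity) are hypothetical stand-ins chosen for illustration; they are not the Unizin Common Data Model or an actual UDP mart.

```python
import sqlite3

# Hypothetical, simplified schema for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE enrollment (student_id TEXT, course_id TEXT);
CREATE TABLE learning_event (
    student_id TEXT, course_id TEXT, event_type TEXT, event_time TEXT
);

INSERT INTO enrollment VALUES ('s1', 'c1'), ('s2', 'c1');
INSERT INTO learning_event VALUES
    ('s1', 'c1', 'page_view',  '2023-09-01T10:00:00'),
    ('s1', 'c1', 'submission', '2023-09-02T11:30:00'),
    ('s2', 'c1', 'page_view',  '2023-09-01T09:15:00');

-- The "mart": one flat row per student and course with precomputed fields.
CREATE TABLE mart_student_activity AS
SELECT e.student_id,
       e.course_id,
       COUNT(ev.event_type) AS event_count,
       MAX(ev.event_time)   AS last_activity
FROM enrollment AS e
LEFT JOIN learning_event AS ev
       ON ev.student_id = e.student_id AND ev.course_id = e.course_id
GROUP BY e.student_id, e.course_id;
""")

# A data consumer now needs only a trivial query against the flat table.
rows = conn.execute(
    "SELECT student_id, course_id, event_count, last_activity "
    "FROM mart_student_activity ORDER BY student_id"
).fetchall()
print(rows)
```

The join and aggregation are written once, when the mart is built. Anyone reading mart_student_activity afterward works with a single flat table.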
What makes this approach not only possible, but extremely powerful for the consortium is the Unizin Common Data Model. The time and effort we invested in developing and deploying a standardized and unified data model means we can easily propagate data marts across the entire consortium and provide new information to a larger number of constituents at a lower technical threshold, while utilizing fewer specialized resources.
Consider the complexity of the SQL queries required to parse volumes of data across disparate sets, including all the context data describing students and courses, as well as all the pertinent event data generated by students throughout their learning – navigating to a page, submitting an assignment, or completing a reading assignment and associated quiz.
Now imagine multiplying that effort across a network of institutions where data is described, named, and cataloged differently.
Applying data marts against our unified and common data model allows us to work smarter, not harder. We can collaborate to define data parameters and then write complex SQL queries once, after which they can be distributed and applied across individual member campuses and within member groups interested in pursuing collaborative research.
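The "write once, apply everywhere" pattern can be sketched as follows. This is an illustration under stated assumptions: each in-memory SQLite database stands in for one member's tenant, and the learning_event table, the MART_SQL query, and the tenant names are all hypothetical.

```python
import sqlite3

# Written once against the shared (hypothetical) schema.
MART_SQL = """
SELECT course_id, COUNT(DISTINCT student_id) AS active_students
FROM learning_event
GROUP BY course_id
ORDER BY course_id
"""

def make_tenant(events):
    """Build an in-memory stand-in for one member's tenant."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE learning_event (student_id TEXT, course_id TEXT)")
    conn.executemany("INSERT INTO learning_event VALUES (?, ?)", events)
    return conn

# Two hypothetical members whose data follows the same model.
tenants = {
    "member_a": make_tenant([("s1", "c1"), ("s2", "c1"), ("s1", "c1")]),
    "member_b": make_tenant([("s9", "c7")]),
}

# The identical query runs unchanged against every tenant.
results = {name: conn.execute(MART_SQL).fetchall() for name, conn in tenants.items()}
print(results)
```

Because every tenant describes its data the same way, the query needs no per-institution translation layer. Without a common model, each member would need its own version of MART_SQL.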
We have already written a data mart to answer the question “Does time spent in Canvas positively or negatively correlate with student outcomes?” Data consumers representing various stakeholder groups on our member campuses can now easily and quickly integrate continuously calculated and updated data fields that describe a student’s overall activity into new or existing applications, reports, or dashboards.
Using this model, we now offer a suite of data marts including:
- Interaction Session approximates an answer to a basic question: how much time do students spend using the LMS and learning tools for their learning-related activities in a course?
- File Interaction keeps track of file interactions within a course offering.
- Last Activity includes information about the latest activities of students in a course offering.
- Course Status includes information regarding the status of a course offering in a learning environment.
- LTI Tool Use keeps track of use of LTI tools in the learning environment.
- LMS Tool Use keeps track of interactions with specific tools found within the LMS.
- Taskforce Mart – Level 1 Aggregator collects student-level data about performance and activities in a course in real-time, including assignments, discussions, and learning environment activities.
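To give a sense of why time spent can only be approximated from event data, here is a minimal sketch of one common heuristic: sum the gaps between consecutive events, and ignore any gap longer than an inactivity threshold. This is not necessarily how the Interaction Session mart is computed; the function name estimate_time_spent and the 30-minute threshold are assumptions made for illustration.

```python
from datetime import datetime, timedelta

def estimate_time_spent(timestamps, gap=timedelta(minutes=30)):
    """Sum the time between consecutive events, skipping gaps longer than `gap`.

    A long gap is treated as the student having left, so it starts a new
    session and contributes nothing to the total.
    """
    ordered = sorted(timestamps)
    total = timedelta(0)
    for earlier, later in zip(ordered, ordered[1:]):
        delta = later - earlier
        if delta <= gap:
            total += delta
    return total

# Hypothetical events for one student in one course on one day.
events = [
    datetime(2023, 9, 1, 10, 0),
    datetime(2023, 9, 1, 10, 10),
    datetime(2023, 9, 1, 10, 25),
    datetime(2023, 9, 1, 14, 0),   # more than 30 minutes later: new session
    datetime(2023, 9, 1, 14, 5),
]
time_spent = estimate_time_spent(events)
print(time_spent)  # 0:30:00
```

The result depends on the threshold chosen, and the time a student spends after the last event in a session is not captured at all, which is why a figure like this is an approximation and not a measurement.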
Our Data Mart Task Force, which includes representatives from across the consortium, is collaborating to identify the next generation of data marts to continually add value to the consortium.
The data mart model is an important first step toward transforming the UDP from a data aggregation tool into the decision intelligence engine it was meant to be, for the benefit of every member of the Unizin consortium. Together we have jump-started the process of rapidly transforming data into the knowledge and insights that will instigate change and ultimately drive student success.