Data is becoming the most valuable digital asset, and getting data management right is as critical for universities as it is for any other enterprise. For Unizin, this is non-negotiable: Data belongs to the universities and needs to be accessible and standardized so that institutions can use that information as they see fit. A common data model and architecture can accelerate data management for all of higher education, while allowing universities to focus on innovating above the data architecture.
Unizin is continually working with its Members to understand the current state of data on their campuses and assess local investments. We’ve developed a data architecture that addresses common challenges and inefficiencies pervasive in higher education.
Our first step is to become our Members’ data launderer by building the Unizin Data Platform atop the Unizin Common Data Model. This step is crucial because data is not currently standardized across applications.
Data is generated by applications to serve their own needs. It can either reside permanently in that application or be moved manually to other places. The latter tends to be accomplished through a variety of reports that can be manipulated to focus on specific information. Along the way the data can be altered by the user, who may change its meaning or merge it with other data. The conclusions reached from any data often require explanation, if not justification, because a common data language doesn’t exist. All in all, most large enterprise data exists in an ill-managed, ill-defined state.
As James Hilton and Brad Wheeler wrote in 2014: “The existence of data exhaust offers great promise in fueling a more data-driven approach to understanding and improving teaching and learning. But it also creates a high-stakes game in which the platform becomes an essential part of the activity. Therein lies the rub; because those platforms are almost always sticky, getting content and data out of them is very difficult.”
Anyone who has moved from one LMS to another knows that the switching costs are punishingly high. More importantly, anyone who has tried to reuse content or data from past systems has discovered that the export feature is often an afterthought—barely functional and highly impoverished.
Organizations spend a lot of their resources finding, defining, storing, and managing data to overcome these challenges. A lot of smart, talented people spend much of their time trying to extract data from applications. One reason their work is so time-consuming is that applications often name the same datum differently: where one app may label a field “first name,” another might call that same data point “First_Name.” In a relatively mature organization, a savvy database administrator has probably done this mapping work in the enterprise data warehouse, but that knowledge is still not necessarily understood or shared by everyone within the organization.
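To make the mismatch concrete, here is a minimal sketch of the kind of field-name mapping that data administrators end up maintaining by hand. All source and target names are invented for illustration; they are not actual UCDM or vendor field names.

```python
# Hypothetical sketch: the same datum ("first name") appears under different
# field names in different applications. A small mapping table normalizes
# each source record into one shared vocabulary.
FIELD_MAP = {
    "sis_export": {"first name": "first_name", "surname": "last_name"},
    "lms_report": {"First_Name": "first_name", "Last_Name": "last_name"},
}

def normalize(source: str, record: dict) -> dict:
    """Rename source-specific fields to the shared names, dropping unknowns."""
    mapping = FIELD_MAP[source]
    return {mapping[k]: v for k, v in record.items() if k in mapping}

print(normalize("sis_export", {"first name": "Ada", "surname": "Lovelace"}))
print(normalize("lms_report", {"First_Name": "Ada", "Last_Name": "Lovelace"}))
# Both print: {'first_name': 'Ada', 'last_name': 'Lovelace'}
```

Multiply this small table by hundreds of fields and dozens of systems and the cost of maintaining it locally, at every institution, becomes clear.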
Universities also spend a lot of money to store this data (often with many redundancies for various use cases), protect it (following the instructions of institutional data stewards and owners), and deliver it effectively and efficiently.
As Hilton wrote: “By moving to a hosted set of shared services built on open standards, Unizin allows higher education institutions to concentrate on the what and the how of teaching.” Unizin seeks to eliminate unnecessary redundancies by working collaboratively to build common-gauge rails for data management. There are two initial efforts to establish commonality of the data architecture across the Unizin Consortium: the data model and the reference architecture.
The Unizin Common Data Model (UCDM) is the Unizin standard for representing data from across the teaching and learning ecosystem. It provides a common language and model for data that comes from a variety of tools and systems. This uniformity is essential for researchers, faculty, application developers, and other staff studying the data landscape. The model aligns with and extends the Common Education Data Standards (CEDS) and is compatible with event standards such as Caliper and xAPI.
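As a rough illustration of what a common model provides, the sketch below defines a toy enrollment record that payloads from different systems could be mapped into. The entity, field, and function names are hypothetical and do not reflect the actual UCDM schema.

```python
# Illustrative only: a toy "common model" record that several source systems
# could be mapped into. Names are hypothetical, not the actual UCDM schema.
from dataclasses import dataclass
from datetime import datetime

@dataclass
class EnrollmentRecord:
    person_id: str           # institution-issued identifier
    course_offering_id: str  # the specific section/term offering
    role: str                # e.g. "student" or "instructor"
    enrolled_at: datetime    # when the enrollment became active

def from_lms_payload(evt: dict) -> EnrollmentRecord:
    """Map one hypothetical LMS payload into the shared shape."""
    return EnrollmentRecord(
        person_id=evt["user"]["id"],
        course_offering_id=evt["course"]["id"],
        role=evt["role"].lower(),
        enrolled_at=datetime.fromisoformat(evt["created_at"]),
    )

record = from_lms_payload({
    "user": {"id": "p-123"}, "course": {"id": "c-456"},
    "role": "Student", "created_at": "2024-01-15T09:30:00",
})
```

Once every source is expressed in one shape like this, downstream analyses no longer need to know which vendor produced the data.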
The Unizin Data Platform (UDP) is a secure data enclave built for Member institutions to house sensitive student data. Access to an institution’s data remains wholly under the control of that institution. Designed as a cloud-based solution, the UDP provides a place to safely store a wide variety of student-related data along with a growing set of integration and analytic services. The UDP offers data products and services for learning analytics, application development, research, and business intelligence at the institutional level. It collects and standardizes data from a variety of sources using the UCDM and gives researchers and analysts access to data marts, real-time event processing, and APIs.
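The sketch below hints at the kind of analysis that standardized event data makes routine: tallying learning-tool activity per course offering. The event shape is loosely modeled on Caliper-style events but is hypothetical; it is not the UDP’s actual API or event format.

```python
# A minimal sketch of one analysis standardized event data enables:
# counting learning-tool events per course offering.
from collections import Counter
from typing import Iterable

def events_per_course(events: Iterable[dict]) -> Counter:
    """Tally events by course offering, skipping malformed records."""
    counts = Counter()
    for evt in events:
        course = evt.get("object", {}).get("course_offering_id")
        if course:
            counts[course] += 1
    return counts

sample = [
    {"action": "Viewed", "object": {"course_offering_id": "BIO-101"}},
    {"action": "Submitted", "object": {"course_offering_id": "BIO-101"}},
    {"action": "Viewed", "object": {"course_offering_id": "HIST-210"}},
]
print(events_per_course(sample))  # Counter({'BIO-101': 2, 'HIST-210': 1})
```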
The UDP will streamline efforts to ingest, store, process, and deliver data elements from our Member institutions. The data will be focused on teaching and learning and on the measurable or discoverable factors that may influence them. This consistent data model will enable universities to work together to develop research-based approaches for improving teaching and learning outcomes. Three central benefits arise from this shared data architecture:
- Leveraging economies of scale in a secure data enclave governed by the Member institutions.
- Evolving common data integration standards and practices across multiple data sources tied to student learning outcomes (e.g., SIS, LMS, admissions data, tool data) in an open way rather than through black-box proprietary solutions (see the sketch after this list).
- Participating in the creation of the world’s largest learning laboratory at a pace that matches institutional desire and readiness. Many questions cannot be answered within the context of a single institution. The UDP enables institutions to take a scholarly and practical approach to critical questions around student performance.
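As a rough sketch of the cross-source analysis these standards enable, the example below joins hypothetical SIS enrollment records with LMS activity counts once both are expressed in a shared vocabulary. The field names and in-memory “tables” are illustrative only.

```python
# Join SIS enrollments with LMS activity on shared identifiers.
# All records and field names are invented for illustration.
sis_enrollments = [
    {"person_id": "p1", "course_offering_id": "BIO-101", "final_grade": "B+"},
    {"person_id": "p2", "course_offering_id": "BIO-101", "final_grade": "A-"},
]
lms_activity = {
    ("p1", "BIO-101"): 42,  # e.g. tool events per learner per course
    ("p2", "BIO-101"): 7,
}

joined = [
    {**row, "event_count": lms_activity.get(
        (row["person_id"], row["course_offering_id"]), 0)}
    for row in sis_enrollments
]
for row in joined:
    print(row)
```

Because the identifiers and field names are shared, the same two-source join works the same way at every participating institution, which is what makes multi-institution questions tractable.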
The data architecture developed by Unizin engineers, programmers, and data scientists to address the challenges detailed above accounts for innumerable local variations and will save countless staff hours. The end result will be a treasure trove of information about student experiences that will allow researchers to improve our understanding of the learning process and develop teaching practices that best support students.