
Welcome to the DFRWS APAC 2022 Forensic Rodeo. This rodeo will have you exploring a real Apple Health database making discoveries about the vast variety of information that can be found within and interpreting its meanings.
There are two sqlite database that store information related to the Apple Health application on Apple devices, healthdb_secure.sqlite and healthdb.sqlite. For this rodeo, you will mainly consider the healthdb_secure.sqlite as this database contains the majority of the health related information.
The rodeo consists of a series of beginner questions to serve as an introduction for users to familiarise themselves with the database, and advanced questions for users to attempt more interesting and thoughtful interpretations of the health data present. The aim of this page will be to provide a guided tutorial to describe the health database and to lead you through the rodeos introduction level questions.
There are some useful tools that may help you in this rodeo:
- We use DB Browser for SQLite to navigate the database and write queries.
- Excel or Matlab for data analysis.
- Apple stores its timestamps as Cocoa Core time. See Cocoa Core Data Timestamp Converter for help in converting to a readable format.
- This SQLite Query Tutorial page may be helpful if you get stuck on SQLite Queries.
Tutorial
Tables
The sqlite database consists of various tables that store different kinds of health information. Using DB Browser for SQLite or equivalent alternative, you can navigate these tables in a GUI.
By selecting "Browse Data" and then the drop down menu to select a table, you can view that tables contents.
Schema and Queries
Sometimes not all data can fit into a single table and must be spread across multiple tables. Tables that contain shared information are linked by what is called a "Schema" which is typically a common column across both tables. For example, consider the samples table and the quantity_samples table: Both of these tables contain the column "data_id".
Tables in the database can be joined together with a SQLite query if they share a common "schema" or column. The image below shows the columns for both the samples and quantity samples tables. Each of these tables can then be 'joined' through the common "data_id" column. An example of the SQLite query is shown below.
Sometimes the columns across do not share a common name and the schema must be inspected in the table overview panel. Below is an example of how the metadata_keys table connects to the metadata_values table despite differently named columns:
The common SQLite query statements you can use during these challenges are:
- select - Choose which columns you would like displayed
- datetime - Convert the apple cocoa core time
- from - Choose which table the columns are from
- left outer join - Connect the common schema between tables
- where - Filter by certain values
- order by - Choose the order by which the resulting table or column is displayed. Can be ASC or DESC
- like - Filter by similar values or text
Activity Data Types
Within the samples table is the column "data_type" which details what kind of activity that data_id and timestamp is associated with. However, that table has those activity types listed as integers and not in plain text. To help users, we are providing a short sample list of those data_types to help with your exploration and participation of the workshop:
data_type | Activity |
5 | Heart Rate |
7 | Steps |
8 | Distance |
9 | Basal/Resting Energy |
10 | Active Energy |
12 | Flights Climbed |
79 | Workout |
Examples
The below image is a query example that connects the samples, quantity samples, objects and data_provenances tables together. It also illustrates the use of the 'select' statement to choose what columns to display, the 'datetime' function to convert from cocoa core time to UTC time, the 'left outer join' statements to connect the columns from the tables together, the 'where' clause to specify only when the quantity is not zero, and the 'order by' clause to order it from earliest start date to latest start date.
The following image is a query example that calculates the total number of steps (data type 7) using the "sum" function for all start dates that contain '2017' somewhere in its field.