Analyzing FHIR Data in a Tabular Format With Python

Learning objectives

Understand the high-level approaches for converting FHIR-formatted data into tabular data for analysis in Python.
Learn how to request data from a FHIR server and create tidy tabular data tables using the FHIR-PYrate library.

Introduction

For the best learning experience, run this tutorial interactively, via one of the environment setup options.

Data analysis approaches in Python often use Pandas DataFrames to store tabular data. There are two primary approaches to loading FHIR-formatted data into Pandas DataFrames:

Writing Python code to manually convert FHIR instances in JSON format into DataFrames.

This does not require any special skills beyond data manipulation in Python, but in practice it can be laborious (especially with a large number of data elements) and prone to bugs.
Using a purpose-built library like FHIR-PYrate to automatically convert FHIR instances into DataFrames.

It is recommended to try this approach first and fall back to manual conversion only if needed.
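As a rough illustration of the manual approach, the sketch below converts one Patient instance into a one-row DataFrame. The `patient` dictionary is a hand-written, trimmed example and `patient_to_row` is a hypothetical helper name; neither comes from FHIR-PYrate.

```python
import pandas as pd

# Hypothetical, hand-written Patient instance trimmed to a few elements
patient = {
    "resourceType": "Patient",
    "id": "example-1",
    "gender": "male",
    "birthDate": "1922-11-05",
    "name": [{"family": "Weber641", "given": ["Louie190"]}],
}


def patient_to_row(resource: dict) -> dict:
    """Pick a few elements out of one Patient instance, tolerating missing data."""
    names = resource.get("name") or [{}]
    first_name = names[0]
    return {
        "id": resource.get("id"),
        "gender": resource.get("gender"),
        "birth_date": resource.get("birthDate"),
        "family_name": first_name.get("family"),
        "given_name": " ".join(first_name.get("given", [])),
    }


manual_df = pd.DataFrame([patient_to_row(patient)])
print(manual_df)
```

Every additional element needs its own line of extraction code and its own handling of missing values, which is where the effort and the bugs tend to accumulate.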

In this tutorial, we’re using a FHIR server located at http://localhost:8080/fhir but any FHIR server loaded with appropriate data can be used. For instructions on setting up your own test server, see Standing up a FHIR Testing Server.

Learning Paths

This tutorial offers three difficulty levels to accommodate different experience levels:

Difficulty Levels
Level	Focus Areas	Recommended For
Beginner	Basic FHIR connection, Simple data retrieval	Those new to FHIR or Python data analysis
Intermediate	Column selection, FHIRPath usage	Those familiar with basic DataFrame operations
Advanced	Multiple resources, Complex searching	Experienced data analysts working with FHIR

You can follow the tutorial sequentially or jump to the section that matches your experience level.

Retrieving FHIR data (Beginner Level)

In this section, you’ll learn how to:

Connect to a FHIR server
Retrieve basic patient data
Convert FHIR resources to a Pandas DataFrame

Tip 1: Beginner Level Validation Checklist

Your setup is successful if you can confirm:

Dependencies install without errors
FHIR server connection established (status 200)
DataFrame displays patient data

Common issues:

Package installation errors: Check Python version (3.8+ required)
Empty DataFrame: Check search parameters

Check the server connection.

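# Load dependencies
import os

import requests

# Read the server URL from the FHIR_SERVER environment variable,
# falling back to the local test server used in this tutorial
fhir_server = os.environ.get("FHIR_SERVER", "http://localhost:8080/fhir")
print(f"Using FHIR server: {fhir_server}")

# Check if the server is running and connection is successful
# (the timeout keeps the request from hanging if the server is unreachable)
response = requests.get(f"{fhir_server}/metadata", timeout=30)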

print(f"Server status: {response.status_code}")

Using FHIR server: http://localhost:8080/fhir
Server status: 200

Understanding the FHIR Metadata Endpoint

The metadata endpoint (/metadata) is a special FHIR endpoint that returns the server’s capability statement - a structured document that describes what the server can do. When we query this endpoint:

We’re checking if the server is responsive (status code 200)
We’re verifying it’s a valid FHIR server
The response contains details about supported resources, operations, and search parameters

This is a lightweight way to validate connectivity before attempting more complex queries.
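To go one step beyond the status code, the parsed response can be checked for the fields that identify a capability statement. The sketch below is one way to do this; `summarize_capability_statement` is a hypothetical helper, and `sample_capability` is a hand-written stand-in so the example runs without a server.

```python
def summarize_capability_statement(capability: dict) -> dict:
    """Return the FHIR version and supported resource types from a parsed /metadata response."""
    if capability.get("resourceType") != "CapabilityStatement":
        raise ValueError("Response is not a FHIR CapabilityStatement")
    resource_types = sorted(
        resource["type"]
        for rest in capability.get("rest", [])
        for resource in rest.get("resource", [])
        if "type" in resource
    )
    return {
        "fhir_version": capability.get("fhirVersion"),
        "resource_types": resource_types,
    }


# Hand-written stand-in for the JSON body returned by a real server
sample_capability = {
    "resourceType": "CapabilityStatement",
    "fhirVersion": "4.0.1",
    "rest": [
        {"mode": "server", "resource": [{"type": "Patient"}, {"type": "Observation"}]}
    ],
}

summary = summarize_capability_statement(sample_capability)
print(summary)
```

Against a live server, the same helper could be called with `response.json()` from the connection check.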

If connection to the server is successful (code 200), proceed with the next code block to pull data from the server.

# Load dependencies
from fhir_pyrate import Pirate
import pandas as pd

# Instantiate a Pirate object using the FHIR-PYrate library to query the server
search = Pirate(
    auth=None,  # No authentication is needed for the local test server
    base_url=fhir_server,
    print_request_url=True,
)

# Use the whimsically named `steal_bundles()` method
# to instantiate a search interaction
# For more information, see https://github.com/UMEssen/FHIR-PYrate/#pirate
bundles = search.steal_bundles(
    resource_type="Patient",
    request_params={
        "_count": 10,  # Get 10 instances per page
    },
    num_pages=1,  # Get 1 page (so a total of 10 instances)
)

# Execute the search and convert to a Pandas DataFrame
df = search.bundles_to_dataframe(bundles)

df.head(5)

http://localhost:8080/fhir/Patient?_count=10

Query (Patient): 100%|██████████| 1/1 [00:00<00:00, 751.53it/s]

	resourceType	id	meta_versionId	meta_lastUpdated	meta_source	meta_profile_0	text_status	text_div	extension_0_url	extension_0_extension_0_url	...	maritalStatus_coding_0_system	maritalStatus_coding_0_code	maritalStatus_coding_0_display	maritalStatus_text	multipleBirthBoolean	communication_0_language_coding_0_system	communication_0_language_coding_0_code	communication_0_language_coding_0_display	communication_0_language_text	address_0_postalCode
0	Patient	837e80f6-a7a5-77f8-36aa-c7b8ff002c4b	1	2025-05-22T21:40:48.562+00:00	#PqX7rAkKbpo7HbGt	http://hl7.org/fhir/us/core/StructureDefinitio...	generated	<div xmlns="http://www.w3.org/1999/xhtml">Gene...	http://hl7.org/fhir/us/core/StructureDefinitio...	ombCategory	...	http://terminology.hl7.org/CodeSystem/v3-Marit...	S	S	S	False	urn:ietf:bcp:47	fr-FR	French (France)	French (France)	NaN
1	Patient	39533e4a-f6f2-a144-ab37-6500460250dc	1	2025-05-22T21:40:55.463+00:00	#QqLNMq0sw9xVSFB9	http://hl7.org/fhir/us/core/StructureDefinitio...	generated	<div xmlns="http://www.w3.org/1999/xhtml">Gene...	http://hl7.org/fhir/us/core/StructureDefinitio...	ombCategory	...	http://terminology.hl7.org/CodeSystem/v3-Marit...	M	M	M	False	urn:ietf:bcp:47	en-US	English	English	NaN
2	Patient	68c3ae0b-e298-62b7-5d3a-7936fd998fe0	1	2025-05-22T21:41:12.944+00:00	#AttNtCcXbkynDqeJ	http://hl7.org/fhir/us/core/StructureDefinitio...	generated	<div xmlns="http://www.w3.org/1999/xhtml">Gene...	http://hl7.org/fhir/us/core/StructureDefinitio...	ombCategory	...	http://terminology.hl7.org/CodeSystem/v3-Marit...	M	M	M	False	urn:ietf:bcp:47	en-US	English	English	NaN
3	Patient	df860bc2-1943-237f-7445-ed960a1ef069	1	2025-05-22T21:41:15.068+00:00	#mtSuzIMdVdb3ZulO	http://hl7.org/fhir/us/core/StructureDefinitio...	generated	<div xmlns="http://www.w3.org/1999/xhtml">Gene...	http://hl7.org/fhir/us/core/StructureDefinitio...	ombCategory	...	http://terminology.hl7.org/CodeSystem/v3-Marit...	S	S	S	False	urn:ietf:bcp:47	en-US	English	English	NaN
4	Patient	7d9ba758-f0b8-3fc3-befa-f0e8e8fb6935	1	2025-05-22T21:41:30.624+00:00	#T7HXFCOWRrKiVdne	http://hl7.org/fhir/us/core/StructureDefinitio...	generated	<div xmlns="http://www.w3.org/1999/xhtml">Gene...	http://hl7.org/fhir/us/core/StructureDefinitio...	ombCategory	...	http://terminology.hl7.org/CodeSystem/v3-Marit...	M	M	M	False	urn:ietf:bcp:47	en-US	English	English	01915

5 rows × 89 columns

Tip 2: Understanding Your Output

If successful, you should see a DataFrame with multiple columns containing patient information. Common columns include:

identifier_0_value: Patient ID
gender: Patient gender
birthDate: Patient date of birth
name_0_family: Patient family name

If you don’t see this structure, review the validation checklist in Tip 1.

It is easier to see the contents of this DataFrame by printing out its first row vertically:

# Print the first row of the DataFrame vertically for easier reading.
pd.set_option("display.max_rows", 100)  # Show up to 100 rows, enough for all 89 columns once transposed
df.head(1).T

	0
resourceType	Patient
id	837e80f6-a7a5-77f8-36aa-c7b8ff002c4b
meta_versionId	1
meta_lastUpdated	2025-05-22T21:40:48.562+00:00
meta_source	#PqX7rAkKbpo7HbGt
meta_profile_0	http://hl7.org/fhir/us/core/StructureDefinitio...
text_status	generated
text_div	<div xmlns="http://www.w3.org/1999/xhtml">Gene...
extension_0_url	http://hl7.org/fhir/us/core/StructureDefinitio...
extension_0_extension_0_url	ombCategory
extension_0_extension_0_valueCoding_system	urn:oid:2.16.840.1.113883.6.238
extension_0_extension_0_valueCoding_code	2106-3
extension_0_extension_0_valueCoding_display	White
extension_0_extension_1_url	text
extension_0_extension_1_valueString	White
extension_1_url	http://hl7.org/fhir/us/core/StructureDefinitio...
extension_1_extension_0_url	ombCategory
extension_1_extension_0_valueCoding_system	urn:oid:2.16.840.1.113883.6.238
extension_1_extension_0_valueCoding_code	2186-5
extension_1_extension_0_valueCoding_display	Non Hispanic or Latino
extension_1_extension_1_url	text
extension_1_extension_1_valueString	Non Hispanic or Latino
extension_2_url	http://hl7.org/fhir/StructureDefinition/patien...
extension_2_valueString	Dallas143 Hirthe744
extension_3_url	http://hl7.org/fhir/us/core/StructureDefinitio...
extension_3_valueCode	M
extension_4_url	http://hl7.org/fhir/StructureDefinition/patien...
extension_4_valueAddress_city	Nice
extension_4_valueAddress_state	Provence-Alpes-Cote d'Azur
extension_4_valueAddress_country	FR
extension_5_url	http://synthetichealth.github.io/synthea/disab...
extension_5_valueDecimal	5.530027
extension_6_url	http://synthetichealth.github.io/synthea/quali...
extension_6_valueDecimal	74.469973
identifier_0_system	https://github.com/synthetichealth/synthea
identifier_0_value	837e80f6-a7a5-77f8-36aa-c7b8ff002c4b
identifier_1_type_coding_0_system	http://terminology.hl7.org/CodeSystem/v2-0203
identifier_1_type_coding_0_code	MR
identifier_1_type_coding_0_display	Medical Record Number
identifier_1_type_text	Medical Record Number
identifier_1_system	http://hospital.smarthealthit.org
identifier_1_value	837e80f6-a7a5-77f8-36aa-c7b8ff002c4b
identifier_2_type_coding_0_system	http://terminology.hl7.org/CodeSystem/v2-0203
identifier_2_type_coding_0_code	SS
identifier_2_type_coding_0_display	Social Security Number
identifier_2_type_text	Social Security Number
identifier_2_system	http://hl7.org/fhir/sid/us-ssn
identifier_2_value	999-66-1459
identifier_3_type_coding_0_system	http://terminology.hl7.org/CodeSystem/v2-0203
identifier_3_type_coding_0_code	DL
identifier_3_type_coding_0_display	Driver's License
identifier_3_type_text	Driver's License
identifier_3_system	urn:oid:2.16.840.1.113883.4.3.25
identifier_3_value	S99915912
identifier_4_type_coding_0_system	http://terminology.hl7.org/CodeSystem/v2-0203
identifier_4_type_coding_0_code	PPN
identifier_4_type_coding_0_display	Passport Number
identifier_4_type_text	Passport Number
identifier_4_system	http://standardhealthrecord.org/fhir/Structure...
identifier_4_value	X42393955X
name_0_use	official
name_0_family	Weber641
name_0_given_0	Louie190
name_0_prefix_0	Mr.
telecom_0_system	phone
telecom_0_value	555-814-1743
telecom_0_use	home
gender	male
birthDate	1922-11-05
deceasedDateTime	2003-07-11T12:20:54-04:00
address_0_extension_0_url	http://hl7.org/fhir/StructureDefinition/geoloc...
address_0_extension_0_extension_0_url	latitude
address_0_extension_0_extension_0_valueDecimal	42.088105
address_0_extension_0_extension_1_url	longitude
address_0_extension_0_extension_1_valueDecimal	-70.678992
address_0_line_0	121 Durgan Boulevard Unit 90
address_0_city	Green Harbor-Cedar Crest
address_0_state	MA
address_0_country	US
maritalStatus_coding_0_system	http://terminology.hl7.org/CodeSystem/v3-Marit...
maritalStatus_coding_0_code	S
maritalStatus_coding_0_display	S
maritalStatus_text	S
multipleBirthBoolean	False
communication_0_language_coding_0_system	urn:ietf:bcp:47
communication_0_language_coding_0_code	fr-FR
communication_0_language_coding_0_display	French (France)
communication_0_language_text	French (France)
address_0_postalCode	NaN

If you look at the output above, you can see FHIR-PYrate collapsed the hierarchical FHIR data structure into DataFrame columns. FHIR-PYrate does this by taking an element from the FHIR-formatted data like Patient.identifier[0].value and converting it to an underscore-delimited column name like identifier_0_value. (Note that Patient.identifier has multiple values in the FHIR data, so there are multiple identifier_N_... columns in the DataFrame.)

FHIR to DataFrame Mapping Example

FHIR JSON Structure	DataFrame Column Name
`{"identifier": [{"value": "123"}]}`	`identifier_0_value`
`{"name": [{"family": "Smith"}]}`	`name_0_family`
`{"telecom": [{"system": "phone", "value": "555-1234"}]}`	`telecom_0_system`, `telecom_0_value`

This mapping allows you to access nested FHIR data using familiar DataFrame operations.
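The naming convention can be reproduced with a short recursive function. This is only a sketch of the idea, not FHIR-PYrate's actual implementation; `flatten` is a hypothetical helper and `resource` combines the three JSON fragments from the table.

```python
def flatten(value, prefix=""):
    """Recursively flatten nested dicts and lists into underscore-delimited keys."""
    flat = {}
    if isinstance(value, dict):
        for key, child in value.items():
            flat.update(flatten(child, f"{prefix}_{key}" if prefix else key))
    elif isinstance(value, list):
        # List positions become numeric parts of the column name
        for index, child in enumerate(value):
            flat.update(flatten(child, f"{prefix}_{index}" if prefix else str(index)))
    else:
        flat[prefix] = value
    return flat


resource = {
    "identifier": [{"value": "123"}],
    "name": [{"family": "Smith"}],
    "telecom": [{"system": "phone", "value": "555-1234"}],
}

flat_row = flatten(resource)
print(flat_row)
```

The keys of `flat_row` are `identifier_0_value`, `name_0_family`, `telecom_0_system`, and `telecom_0_value`, matching the right-hand column of the table.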

Selecting specific columns (Intermediate Level)

Tip 3: Intermediate Skills Check

Before proceeding with this section, ensure you can:

Understand FHIR resource structure
Work with basic DataFrame operations
Read FHIRPath syntax

Practice Exercise:

Try modifying the previous code to only retrieve patient names and birth dates.

Usually not every single value from a FHIR instance is needed for analysis. There are two ways to get a more concise DataFrame:

Use the approach above to load all elements into a DataFrame, remove the unneeded columns, and rename the remaining columns as needed. The process_function capability in FHIR-PYrate allows you to integrate this approach into the bundles_to_dataframe() method call.
Use FHIRPath to select specific elements and map them onto column names.
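The first approach can be sketched with plain pandas. The DataFrame below is a small hand-built stand-in for the full output shown earlier (two rows and only a few of the 89 columns), and `column_map` is an illustrative name rather than anything defined by FHIR-PYrate.

```python
import pandas as pd

# Hand-built stand-in for the full, flattened Patient DataFrame
full_df = pd.DataFrame(
    {
        "resourceType": ["Patient", "Patient"],
        "identifier_0_value": [
            "837e80f6-a7a5-77f8-36aa-c7b8ff002c4b",
            "39533e4a-f6f2-a144-ab37-6500460250dc",
        ],
        "gender": ["male", "male"],
        "birthDate": ["1922-11-05", "1919-01-28"],
        "maritalStatus_coding_0_code": ["S", "M"],
        "text_status": ["generated", "generated"],
    }
)

# Columns to keep, mapped to the names wanted in the concise DataFrame
column_map = {
    "identifier_0_value": "id",
    "gender": "gender",
    "birthDate": "date_of_birth",
    "maritalStatus_coding_0_code": "marital_status",
}

concise_df = full_df[list(column_map)].rename(columns=column_map)
print(concise_df)
```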

The second approach is typically more concise. For example, to generate a DataFrame like this…

id	gender	date_of_birth	marital_status
…	…	…	…

…you could use the following code:

# Instantiate and perform the FHIR search interaction in a single function call
df = search.steal_bundles_to_dataframe(
    resource_type="Patient",
    request_params={
        "_count": 10,  # Get 10 instances per page
    },
    num_pages=1,  # Get 1 page (so a total of 10 instances)
    fhir_paths=[
        ("id", "identifier[0].value"),
        ("gender", "gender"),
        ("date_of_birth", "birthDate"),
        ("marital_status", "maritalStatus.coding[0].code"),
    ],
)
df

http://localhost:8080/fhir/Patient?_count=10

Query & Build DF (Patient): 100%|██████████| 1/1 [00:00<00:00, 265.65it/s]

	id	gender	date_of_birth	marital_status
0	837e80f6-a7a5-77f8-36aa-c7b8ff002c4b	male	1922-11-05	S
1	39533e4a-f6f2-a144-ab37-6500460250dc	male	1919-01-28	M
2	68c3ae0b-e298-62b7-5d3a-7936fd998fe0	male	1919-01-28	M
3	df860bc2-1943-237f-7445-ed960a1ef069	male	1921-11-21	S
4	7d9ba758-f0b8-3fc3-befa-f0e8e8fb6935	male	1921-07-29	M

Tip 4: Validation: Column Selection

Your code is working correctly if:

DataFrame contains only the specified columns
Column names match your defined mappings
Values have the expected form (e.g., YYYY-MM-DD date strings for birthDate)
No errors in FHIRPath expressions
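These checks can also be scripted. The following is a minimal sketch, assuming a DataFrame shaped like the one above; `selected_df` is rebuilt by hand from the first two rows so the example runs without a server.

```python
import pandas as pd

# Hand-built copy of the first two rows of the DataFrame above
selected_df = pd.DataFrame(
    {
        "id": [
            "837e80f6-a7a5-77f8-36aa-c7b8ff002c4b",
            "39533e4a-f6f2-a144-ab37-6500460250dc",
        ],
        "gender": ["male", "male"],
        "date_of_birth": ["1922-11-05", "1919-01-28"],
        "marital_status": ["S", "M"],
    }
)

# Check that only the mapped columns are present, in the mapped order
expected_columns = ["id", "gender", "date_of_birth", "marital_status"]
columns_match = list(selected_df.columns) == expected_columns
print(f"Columns match: {columns_match}")

# Convert the date strings to a datetime column for analysis
selected_df["date_of_birth"] = pd.to_datetime(
    selected_df["date_of_birth"], format="%Y-%m-%d"
)
print(selected_df.dtypes)
```

FHIR allows partial dates such as 1922 or 1922-11, which the strict format used here would reject, so real data may need a more forgiving conversion.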

While FHIRPath can be quite complex, its use in FHIR-PYrate is often straightforward. Nested elements are separated with ., and elements with multiple sub-values are identified by [N] where N is an integer starting at 0.

Examples illustrating the relationship between FHIRPath and DataFrame column names:

When using FHIRPath, maritalStatus.coding[0].code refers to the same data that appears in the column named maritalStatus_coding_0_code in the full DataFrame output. The [0] indicates it’s the first coding in the maritalStatus array.
Similarly, in the DataFrame output we saw a column identifier_3_type_coding_0_system which corresponds to the FHIRPath expression identifier[3].type.coding[0].system. This refers to the system identifier for the type of the fourth identifier (arrays are zero-indexed).

The element paths can typically be constructed by looking at the element hierarchy on each resource's page in the FHIR specification, or by examining the column names in a full DataFrame output and converting the underscore notation to FHIRPath notation.
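Converting the underscore notation is mechanical enough to script. The sketch below uses a hypothetical helper, `column_to_fhirpath`, and assumes that element names contain no underscores and no purely numeric parts, which holds for the columns shown in this tutorial.

```python
import re


def column_to_fhirpath(column: str) -> str:
    """Convert an underscore-delimited column name into the matching element path."""
    # Turn each "_<digits>" part into an index such as "[0]" ...
    with_indices = re.sub(r"_(\d+)(?=_|$)", r"[\1]", column)
    # ... then turn the remaining underscores into "." separators
    return with_indices.replace("_", ".")


for column in [
    "maritalStatus_coding_0_code",
    "identifier_3_type_coding_0_system",
    "name_0_given_0",
]:
    print(f"{column} -> {column_to_fhirpath(column)}")
```

The first two results are the expressions discussed above: `maritalStatus.coding[0].code` and `identifier[3].type.coding[0].system`.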

See Key FHIR Resources for more information on reading the FHIR specification.

Working with Multiple Resources (Advanced Level)

In this section, you’ll learn techniques for working with multiple FHIR resources simultaneously - a common requirement for clinical data analysis. Building on the previous sections, we’ll explore:

Handling elements with multiple values
Retrieving and linking related resources using _include and _revinclude parameters
Creating more targeted queries with resource-specific filters

Elements with multiple sub-values

There are multiple identifier[N].value values for each instance of Patient in this dataset.

# Instantiate and perform the FHIR search interaction in a single function call
df = search.steal_bundles_to_dataframe(
    resource_type="Patient",
    request_params={
        "_count": 10,  # Get 10 instances per page
    },
    num_pages=1,  # Get 1 page (so a total of 10 instances)
    fhir_paths=[("id", "identifier[0].value"), ("identifiers", "identifier.value")],
)
df

http://localhost:8080/fhir/Patient?_count=10

Query & Build DF (Patient): 100%|██████████| 1/1 [00:00<00:00, 483.27it/s]

	id	identifiers
0	837e80f6-a7a5-77f8-36aa-c7b8ff002c4b	[837e80f6-a7a5-77f8-36aa-c7b8ff002c4b, 837e80f...
1	39533e4a-f6f2-a144-ab37-6500460250dc	[39533e4a-f6f2-a144-ab37-6500460250dc, 39533e4...
2	68c3ae0b-e298-62b7-5d3a-7936fd998fe0	[68c3ae0b-e298-62b7-5d3a-7936fd998fe0, 68c3ae0...
3	df860bc2-1943-237f-7445-ed960a1ef069	[df860bc2-1943-237f-7445-ed960a1ef069, df860bc...
4	7d9ba758-f0b8-3fc3-befa-f0e8e8fb6935	[7d9ba758-f0b8-3fc3-befa-f0e8e8fb6935, 7d9ba75...

To convert to separate columns, you can do the following:

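# Expand the list-valued "identifiers" column into one column per identifier.
# `df.pop()` removes "identifiers" from `df` in place, so re-run the query above before repeating this step.
df.join(pd.DataFrame(df.pop("identifiers").values.tolist()).add_prefix("identifier_"))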

	id	identifier_0	identifier_1	identifier_2	identifier_3	identifier_4
0	837e80f6-a7a5-77f8-36aa-c7b8ff002c4b	837e80f6-a7a5-77f8-36aa-c7b8ff002c4b	837e80f6-a7a5-77f8-36aa-c7b8ff002c4b	999-66-1459	S99915912	X42393955X
1	39533e4a-f6f2-a144-ab37-6500460250dc	39533e4a-f6f2-a144-ab37-6500460250dc	39533e4a-f6f2-a144-ab37-6500460250dc	999-77-3224	S99926454	X7704299X
2	68c3ae0b-e298-62b7-5d3a-7936fd998fe0	68c3ae0b-e298-62b7-5d3a-7936fd998fe0	68c3ae0b-e298-62b7-5d3a-7936fd998fe0	999-64-9980	S99927531	X39594248X
3	df860bc2-1943-237f-7445-ed960a1ef069	df860bc2-1943-237f-7445-ed960a1ef069	df860bc2-1943-237f-7445-ed960a1ef069	999-98-8428	S99941474	X6201645X
4	7d9ba758-f0b8-3fc3-befa-f0e8e8fb6935	7d9ba758-f0b8-3fc3-befa-f0e8e8fb6935	7d9ba758-f0b8-3fc3-befa-f0e8e8fb6935	999-38-2200	S99986860	X89611730X

This will give you separate identifier_0, identifier_1, … columns for each Patient.identifier[N] value.
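If a long format (one row per identifier) suits the analysis better than separate columns, pandas `explode` is an alternative. This sketch uses a small hand-built DataFrame with made-up patient ids in place of the query result.

```python
import pandas as pd

# Hand-built stand-in: each row holds a list of identifier values
ids_df = pd.DataFrame(
    {
        "id": ["patient-a", "patient-b"],
        "identifiers": [
            ["patient-a", "999-66-1459"],
            ["patient-b", "999-77-3224", "X7704299X"],
        ],
    }
)

# One output row per list element; the "id" value is repeated for each
long_df = ids_df.explode("identifiers", ignore_index=True).rename(
    columns={"identifiers": "identifier"}
)
print(long_df)
```

The long format copes with patients that have different numbers of identifiers without producing columns that are mostly empty.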

Retrieving multiple resource types

FHIR-PYrate supports working with multiple resource types in a single query using the _include or _revinclude parameters. This allows you to retrieve related resources in a single API call.

See Using the FHIR API to Access Data for more information on constructing the parameters for FHIR search interactions.
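To make the two parameters concrete, the sketch below builds the search URLs by hand with the standard library. `_revinclude=Observation:patient` on a Patient search asks for Observation instances that point at the matched patients, while `_include=Observation:patient` on an Observation search asks for the Patient instances that the matched observations point at. The base URL is the tutorial's local test server.

```python
from urllib.parse import urlencode

base_url = "http://localhost:8080/fhir"

# Search Patient, and also return Observations that reference those patients
revinclude_params = {"_count": 10, "_revinclude": "Observation:patient"}
revinclude_url = f"{base_url}/Patient?{urlencode(revinclude_params, safe=':')}"

# Search Observation, and also return the Patients those observations reference
include_params = {"_count": 10, "_include": "Observation:patient"}
include_url = f"{base_url}/Observation?{urlencode(include_params, safe=':')}"

print(revinclude_url)
print(include_url)
```

FHIR-PYrate assembles these URLs itself from `resource_type` and `request_params`, so this is only needed for understanding or debugging a request.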

Warning 1: Azure FHIR API Limits

Azure FHIR API limits _include and _revinclude parameters to 100 items. See the Azure documentation for more details.

Using `_revinclude` with FHIRPath

In this example, we retrieve Patient resources along with related Observation resources, and we use FHIRPath to select specific fields from each resource type:

# Retrieve patients and related observations
dfs = search.steal_bundles_to_dataframe(
    resource_type="Patient",
    request_params={
        # Get instances of Observation whose `patient` search parameter
        # (i.e., `Observation.subject`) refers to a fetched Patient instance
        "_revinclude": "Observation:patient",
        "_count": 10,  # Get 10 instances per page
    },
    num_pages=1,  # Get 1 page (10 Patient instances plus the Observation instances included with them)
    fhir_paths=[
        # Common paths that could appear in either resource
        ("id", "id"),
        
        # Patient-specific paths
        ("patient_name", "name[0].family"),
        ("birth_date", "birthDate"),
        
        # Observation-specific paths
        ("observation_code", "code.coding[0].code"),
        ("observation_value", "valueQuantity.value"),
        ("observation_unit", "valueQuantity.unit")
    ]
)

# `dfs` is a dictionary where the key is the FHIR resource type, and the value is the DataFrame
# Split these into separate variables for easy access:
df_patients = dfs["Patient"]
df_observations = dfs["Observation"]

# Each DataFrame will only contain columns relevant to its resource type
# Empty columns are automatically removed from each DataFrame
print(f"Patient columns: {df_patients.columns.tolist()}")
print(f"Observation columns: {df_observations.columns.tolist()}")

# Look at the first row of each DataFrame
df_patients.head(1)
df_observations.head(1)

http://localhost:8080/fhir/Patient?_count=10&_revinclude=Observation:patient

Query & Build DF (Patient): 100%|██████████| 1/1 [00:00<00:00,  3.79it/s]

Patient columns: ['id', 'patient_name', 'birth_date']
Observation columns: ['id', 'observation_code', 'observation_value', 'observation_unit']

	id	observation_code	observation_value	observation_unit
0	793a4247-3b05-fc2f-7ffd-9fbc0ee3ef30	8302-2	180.0	cm

Using `trade_rows_for_dataframe` for more control

Sometimes you need more fine-grained control over how related resources are queried. In these cases, you can use trade_rows_for_dataframe to retrieve related resources based on data in an existing DataFrame:

df_observations2 = search.trade_rows_for_dataframe(
    df_patients,
    resource_type="Observation",
    request_params={
        "_count": "10",  # Get 10 instances per page
    },
    num_pages=1,
    # Load Observations where `Observation.subject` references the instance of Patient
    # identified by `id` in the `df_patients` DataFrame
    df_constraints={"subject": "id"},
    fhir_paths=[
        ("observation_id", "id"),
        ("patient", "subject.reference"),
        ("status", "status"),
        ("code", "code.coding[0].code"),
        ("code_display", "code.coding[0].display"),
        ("value", "valueQuantity.value"),
        ("value_units", "valueQuantity.unit"),
        ("datetime", "effectiveDateTime"),
    ],
)

# Look at the results
df_observations2.head(5)

http://localhost:8080/fhir/Observation?_count=10&subject=837e80f6-a7a5-77f8-36aa-c7b8ff002c4b
http://localhost:8080/fhir/Observation?_count=10&subject=39533e4a-f6f2-a144-ab37-6500460250dc
http://localhost:8080/fhir/Observation?_count=10&subject=68c3ae0b-e298-62b7-5d3a-7936fd998fe0
http://localhost:8080/fhir/Observation?_count=10&subject=df860bc2-1943-237f-7445-ed960a1ef069
http://localhost:8080/fhir/Observation?_count=10&subject=7d9ba758-f0b8-3fc3-befa-f0e8e8fb6935

Query & Build DF (Observation): 100%|██████████| 5/5 [00:00<00:00, 24.40it/s]

	observation_id	patient	status	code	code_display	value	value_units	datetime	id
0	42f36641-62c5-255d-108b-202fa4f8bb1b	Patient/837e80f6-a7a5-77f8-36aa-c7b8ff002c4b	final	8302-2	Body Height	178.50	cm	1993-10-24T11:39:54-04:00	837e80f6-a7a5-77f8-36aa-c7b8ff002c4b
1	ec7344f4-d813-4979-f603-7e6ee8b9017e	Patient/837e80f6-a7a5-77f8-36aa-c7b8ff002c4b	final	72514-3	Pain severity - 0-10 verbal numeric rating [Sc...	2.00	{score}	1993-10-24T11:39:54-04:00	837e80f6-a7a5-77f8-36aa-c7b8ff002c4b
2	f7c4a971-3b98-ee63-7fe0-6b49c16313e7	Patient/837e80f6-a7a5-77f8-36aa-c7b8ff002c4b	final	29463-7	Body Weight	88.30	kg	1993-10-24T11:39:54-04:00	837e80f6-a7a5-77f8-36aa-c7b8ff002c4b
3	0879305f-49f6-2811-cecf-4620de4e4ea7	Patient/837e80f6-a7a5-77f8-36aa-c7b8ff002c4b	final	39156-5	Body Mass Index	27.71	kg/m2	1993-10-24T11:39:54-04:00	837e80f6-a7a5-77f8-36aa-c7b8ff002c4b
4	7af84d1b-518e-99e0-056d-73f45299d31b	Patient/837e80f6-a7a5-77f8-36aa-c7b8ff002c4b	final	85354-9	Blood Pressure	NaN	NaN	1993-10-24T11:39:54-04:00	837e80f6-a7a5-77f8-36aa-c7b8ff002c4b

The trade_rows_for_dataframe approach offers several advantages:

More precise control over query parameters for each related resource
Ability to process patient data row by row, useful for large datasets
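Option to carry columns from the original DataFrame into the result: with_ref (enabled by default, which is why the id column appears above) adds the columns used in df_constraints, and with_columns adds other columns that you name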
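Once patients and observations are in separate DataFrames, they can be linked with an ordinary pandas merge. The sketch below assumes the observation table holds references of the form `Patient/<id>`, as in the `patient` column above; both DataFrames are hand-built with made-up ids so the example runs without a server.

```python
import pandas as pd

# Hand-built stand-ins for the patient and observation DataFrames
patients = pd.DataFrame(
    {"id": ["p1", "p2"], "patient_name": ["Weber641", "Example123"]}
)
observations = pd.DataFrame(
    {
        "observation_id": ["o1", "o2", "o3"],
        "patient": ["Patient/p1", "Patient/p1", "Patient/p2"],
        "code": ["8302-2", "29463-7", "8302-2"],
        "value": [178.5, 88.3, 165.0],
    }
)

# Keep only the id part of each reference, e.g. "Patient/p1" -> "p1"
observations["patient_id"] = observations["patient"].str.split("/").str[-1]

# A left join keeps every observation, even if its patient is missing
linked = observations.merge(
    patients, left_on="patient_id", right_on="id", how="left"
)
print(linked[["observation_id", "patient_name", "code", "value"]])
```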

Filtering by resource attributes

When querying resources, you often need to filter by specific attributes. For example, you might want to retrieve all smoking status observations:

# Directly search for smoking status observations
df_observations2 = search.steal_bundles_to_dataframe(
    resource_type="Observation",
    request_params={
        "code": "http://loinc.org|72166-2",  # LOINC code for smoking status
        "_count": 20,  # Get more observations since we're not limiting by patient
    },
    num_pages=1,
    fhir_paths=[
        ("observation_id", "id"),
        ("patient", "subject.reference"),
        ("status", "status"),
        ("code", "code.coding[0].code"),
        ("code_display", "code.coding[0].display"),
        ("value", "valueCodeableConcept.coding[0].code"),
        ("value_display", "valueCodeableConcept.coding[0].display"),
        ("datetime", "effectiveDateTime"),
    ],
)

# Look at the first 15 rows of the Observations DataFrame
df_observations2.head(15)

http://localhost:8080/fhir/Observation?_count=20&code=http://loinc.org|72166-2

Query & Build DF (Observation): 100%|██████████| 1/1 [00:00<00:00, 85.37it/s]

	observation_id	patient	status	code	code_display	value	value_display	datetime
0	c23bd847-a47a-fed8-4955-c71620eb5edc	Patient/837e80f6-a7a5-77f8-36aa-c7b8ff002c4b	final	72166-2	Tobacco smoking status NHIS	8517006	Former smoker	1993-10-24T11:39:54-04:00
1	442c5a19-f5da-2200-f99c-616f994df28d	Patient/837e80f6-a7a5-77f8-36aa-c7b8ff002c4b	final	72166-2	Tobacco smoking status NHIS	8517006	Former smoker	1994-10-30T10:39:54-05:00
2	f9790b94-4ed5-8548-70d3-803e5998527a	Patient/837e80f6-a7a5-77f8-36aa-c7b8ff002c4b	final	72166-2	Tobacco smoking status NHIS	8517006	Former smoker	1995-11-05T10:39:54-05:00
3	643e65fb-c9cd-7bd7-8dfa-268975c6d2c7	Patient/837e80f6-a7a5-77f8-36aa-c7b8ff002c4b	final	72166-2	Tobacco smoking status NHIS	8517006	Former smoker	1996-11-10T10:39:54-05:00
4	f1e17e16-10b7-0b2a-4322-85139fb2a184	Patient/837e80f6-a7a5-77f8-36aa-c7b8ff002c4b	final	72166-2	Tobacco smoking status NHIS	8517006	Former smoker	1997-11-16T10:39:54-05:00
5	4b8df0c9-2ffc-7208-9e51-070640634eb9	Patient/837e80f6-a7a5-77f8-36aa-c7b8ff002c4b	final	72166-2	Tobacco smoking status NHIS	8517006	Former smoker	1998-11-22T10:39:54-05:00
6	d8a12bf3-0cd6-d707-e774-728cdf9be82e	Patient/837e80f6-a7a5-77f8-36aa-c7b8ff002c4b	final	72166-2	Tobacco smoking status NHIS	8517006	Former smoker	1999-05-09T11:39:54-04:00
7	538423ac-cc0d-7bd6-accc-bbc24fd5d184	Patient/837e80f6-a7a5-77f8-36aa-c7b8ff002c4b	final	72166-2	Tobacco smoking status NHIS	8517006	Former smoker	1999-11-28T10:39:54-05:00
8	95dbf328-bb46-fac1-9f20-368cb5fadfc4	Patient/837e80f6-a7a5-77f8-36aa-c7b8ff002c4b	final	72166-2	Tobacco smoking status NHIS	8517006	Former smoker	2000-12-03T10:39:54-05:00
9	e5ad487f-74dd-e69a-c878-b1916ab3e2c4	Patient/837e80f6-a7a5-77f8-36aa-c7b8ff002c4b	final	72166-2	Tobacco smoking status NHIS	8517006	Former smoker	2001-12-09T10:39:54-05:00
10	a4c23257-23f4-4ac4-58d3-82148409fe77	Patient/837e80f6-a7a5-77f8-36aa-c7b8ff002c4b	final	72166-2	Tobacco smoking status NHIS	8517006	Former smoker	2002-09-01T11:39:54-04:00
11	7e87623f-adb3-f804-9604-11ab17c1dbb5	Patient/837e80f6-a7a5-77f8-36aa-c7b8ff002c4b	final	72166-2	Tobacco smoking status NHIS	8517006	Former smoker	2002-12-15T10:39:54-05:00
12	cfb4c4ac-c60f-5d73-942f-362fa3e94d41	Patient/39533e4a-f6f2-a144-ab37-6500460250dc	final	72166-2	Tobacco smoking status NHIS	266919005	Never smoker	2011-08-23T04:33:40-04:00
13	ba07b212-bc61-b40b-97f2-1f331ff30590	Patient/39533e4a-f6f2-a144-ab37-6500460250dc	final	72166-2	Tobacco smoking status NHIS	266919005	Never smoker	2011-09-20T04:33:40-04:00
14	ca96e4ce-bb34-fab0-0761-ee72c6ace340	Patient/39533e4a-f6f2-a144-ab37-6500460250dc	final	72166-2	Tobacco smoking status NHIS	266919005	Never smoker	2011-10-18T04:33:40-04:00

Note that when retrieving Observation resources, you’ll need to choose the appropriate data type for Observation.value[x] based on the type of observation. For quantitative observations, use valueQuantity.value, but for coded observations (like smoking status), use valueCodeableConcept.coding[0].code.
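When a result set mixes observation types, the choice can be made per instance. The sketch below works on parsed Observation JSON and covers only the two value types discussed here; `observation_value` is a hypothetical helper, and the two sample dictionaries are trimmed, hand-written instances based on the output above.

```python
def observation_value(observation: dict):
    """Return (value, label) for quantity or coded observations, else (None, None)."""
    if "valueQuantity" in observation:
        quantity = observation["valueQuantity"]
        return quantity.get("value"), quantity.get("unit")
    if "valueCodeableConcept" in observation:
        concept = observation["valueCodeableConcept"]
        codings = concept.get("coding", [])
        if codings:
            return codings[0].get("code"), codings[0].get("display")
        return concept.get("text"), None
    # Other value[x] types are not handled in this sketch
    return None, None


# Trimmed, hand-written Observation instances
body_height = {
    "resourceType": "Observation",
    "valueQuantity": {"value": 178.5, "unit": "cm"},
}
smoking_status = {
    "resourceType": "Observation",
    "valueCodeableConcept": {
        "coding": [{"code": "8517006", "display": "Former smoker"}]
    },
}

print(observation_value(body_height))
print(observation_value(smoking_status))
```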

Summary and Next Steps

This tutorial has covered:

Beginner level: Connecting to a FHIR server and retrieving basic patient data
Intermediate level: Using FHIRPath to select specific columns and create focused DataFrames
Advanced level: Working with multiple resources, handling nested data, and performing filtered queries

To continue your learning:

Experiment with different resource types beyond Patient and Observation
Try more complex FHIRPath expressions to extract specific data elements
Combine data from multiple resources for comprehensive clinical analysis
Build visualization and analysis workflows with the retrieved data
