Node.js and HDF5....
- silviamazzoni
- Mar 23
- 3 min read
Now that I know I want to process my data in a Lambda function on AWS (YES, I LOVE LAMBDAS ON AWS!!!), I had one more question for ChatGPT: "can my lambda node js read hdf5 files?"
Here is an answer that saved me a lot of time.... I have enjoyed working in JavaScript since its interpreter is right there in my browser.... glad I asked first!
Perfect! You're in a great position then — using Python in AWS Lambda to process the HDF5 files, and JavaScript in the browser to request only the relevant data is an ideal setup. Let's walk through a simple, working solution step-by-step:
✅ Your Architecture Overview:
- HDF5 files are stored in S3
- Python Lambda:
  - Reads the HDF5 file from S3
  - Extracts only the required data
  - Returns it via API Gateway
- Browser (JavaScript):
  - Sends a request to your API Gateway
  - Receives a lightweight JSON response
  - Uses it for display/processing
🧠 Step 1: Create Python Lambda Function to Read HDF5
📦 Requirements:
You’ll need the h5py library. Since it’s a C-based package, you’ll either:
- Use a Lambda Layer that includes h5py, or
- Deploy your Lambda using a Docker container (which gives you full control)
🚀 We'll go with the Lambda Layer method first (simpler if you're not using Docker yet).
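Either way, a quick sanity check inside the function confirms that h5py is actually importable before you wire up S3. This is just a throwaway test handler; h5py.version.info is a real h5py attribute that returns a string listing the bundled h5py and HDF5 versions.

import h5py

def lambda_handler(event, context):
    # If the layer (or container image) is wired up correctly, the import
    # above succeeds and this returns the bundled h5py/HDF5 versions.
    return {'statusCode': 200, 'body': h5py.version.info}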
🧱 Step 2: Build a Lambda Layer with h5py
Here’s how you can build it on your local machine (you need Docker or a Linux environment for compatibility):
A. Create directory structure:
mkdir -p python
cd python
pip install h5py -t .
cd ..
zip -r h5py-layer.zip python
If you’re on Windows, use WSL or a Docker container for the right architecture (Amazon Linux compatible).
B. Upload as a Lambda Layer:
1. Go to AWS Console → Lambda → Layers → Create layer
2. Name: h5py-layer
3. Upload h5py-layer.zip
4. Compatible runtimes: Python 3.9 (or the version you’ll use)
5. Click Create
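If you'd rather script this step than click through the console, the same zip can be published with boto3 (a sketch, assuming your AWS credentials are configured locally; the layer name and runtime mirror the console values above):

import boto3

lam = boto3.client('lambda')
with open('h5py-layer.zip', 'rb') as f:
    layer = lam.publish_layer_version(
        LayerName='h5py-layer',
        Content={'ZipFile': f.read()},
        CompatibleRuntimes=['python3.9'],
    )
print(layer['LayerVersionArn'])  # attach this ARN to your Lambda function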
🧑‍💻 Step 3: Write the Python Lambda Function
import json
import boto3
import h5py
import io

def lambda_handler(event, context):
    # S3 parameters (could also come from the event)
    bucket = 'your-bucket-name'
    key = 'your-file.h5'

    # Fetch the file from S3
    s3 = boto3.client('s3')
    response = s3.get_object(Bucket=bucket, Key=key)
    file_content = response['Body'].read()

    # Open as HDF5 and read the dataset
    with h5py.File(io.BytesIO(file_content), 'r') as hdf:
        dataset = hdf['your_dataset_name'][:]

    # Convert to a Python list so it can be returned as JSON
    data_list = dataset.tolist()

    return {
        'statusCode': 200,
        'body': json.dumps({
            'data': data_list
        }),
        'headers': {
            'Content-Type': 'application/json',
            'Access-Control-Allow-Origin': '*'  # Allow requests from the browser
        }
    }
Replace 'your_dataset_name' with the actual path inside your .h5 file.
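If you don't know the dataset path up front, h5py can walk the file's contents. A quick sketch, reusing the file_content bytes from the handler above:

# Print every group/dataset path in the file (useful for finding the name)
with h5py.File(io.BytesIO(file_content), 'r') as hdf:
    hdf.visititems(lambda name, obj: print(name, '->', type(obj).__name__))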
🌐 Step 4: Set Up API Gateway to Call Lambda
1. In AWS Console → API Gateway
2. Create a new HTTP API
3. Set up a route like /get-data
4. Connect it to your Lambda
5. Deploy and grab the API endpoint URL
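Before touching browser code, you can smoke-test the endpoint with nothing but Python's standard library (the URL is the same placeholder used in Step 5; substitute your real endpoint):

import json
import urllib.request

url = 'https://your-api-id.amazonaws.com/get-data'
with urllib.request.urlopen(url) as resp:
    payload = json.load(resp)
print('received', len(payload['data']), 'values')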
🧪 Step 5: Call it from the Browser (JavaScript)
fetch('https://your-api-id.amazonaws.com/get-data')
  .then(response => response.json())
  .then(data => {
    console.log('HDF5 data:', data.data); // Use as needed
  })
  .catch(err => console.error('Error fetching data:', err));
🔐 Bonus Tips:
- For larger HDF5 files, consider passing the filename or key in the API request (e.g., via query string) so Lambda can load different files; a sketch of this follows below.
- For security, lock down the API Gateway with auth (e.g., IAM, Cognito, or API keys) depending on your app's needs.
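Here is a minimal sketch of that first tip, assuming an HTTP API (payload format 2.0) event where query parameters arrive under queryStringParameters. The parameter names key and dataset, and the BUCKET constant, are placeholders of my own choosing:

import json
import io
import boto3
import h5py

BUCKET = 'your-bucket-name'  # placeholder, same as in Step 3

def lambda_handler(event, context):
    # Let the caller choose the file and dataset; fall back to the Step 3 values
    params = event.get('queryStringParameters') or {}
    key = params.get('key', 'your-file.h5')
    dataset_name = params.get('dataset', 'your_dataset_name')

    s3 = boto3.client('s3')
    body = s3.get_object(Bucket=BUCKET, Key=key)['Body'].read()

    with h5py.File(io.BytesIO(body), 'r') as hdf:
        data = hdf[dataset_name][:].tolist()

    return {
        'statusCode': 200,
        'body': json.dumps({'data': data}),
        'headers': {
            'Content-Type': 'application/json',
            'Access-Control-Allow-Origin': '*',
        },
    }

The browser would then request something like /get-data?key=other-file.h5&dataset=group/subgroup/values.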
✅ Summary:
| Task | Tool |
| --- | --- |
| Read HDF5 | Python + h5py |
| Host data | AWS Lambda |
| Store files | Amazon S3 |
| Serve via API | API Gateway |
| Consume in browser | JavaScript (Fetch API) |
If you'd prefer a Docker-based deployment (gives more control over Python libraries), or want help creating a dynamic API where the browser can choose which dataset to load — happy to walk through that next!