Quick Start Guide

This guide walks through installing BioEPIC Skills, configuring credentials, and running your first queries.

Setup

1. Install the Package

pip install bioepic_skills

2. Configure Environment Variables

Create a .env file in your project directory:

cp .env.example .env

Edit .env with your credentials:

ENV=prod
CLIENT_ID=your_client_id_here
CLIENT_SECRET=your_client_secret_here
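
Before constructing any client, it can help to confirm that the variables above are actually set. The following is a minimal sketch using only the standard library; the helper name `missing_settings` is hypothetical and not part of bioepic_skills.

```python
import os

# Variable names taken from the .env example above.
REQUIRED_VARS = ("ENV", "CLIENT_ID", "CLIENT_SECRET")


def missing_settings(environ=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; not part of bioepic_skills.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Missing settings:", ", ".join(missing))
    else:
        print("All required settings are present")
```

If the values live in a .env file, call `load_dotenv()` first so they are visible in `os.environ`.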

Basic Usage

Simple API Query

from bioepic_skills.api_search import APISearch
from bioepic_skills.data_processing import DataProcessing

# Create clients
api_client = APISearch(collection_name="samples")
dp = DataProcessing()

# Get records
records = api_client.get_records(max_page_size=10)
print(f"Retrieved {len(records)} records")

# Convert to DataFrame
df = dp.convert_to_df(records)
print(df.head())

Search by Attribute

# Search for specific records
results = api_client.get_record_by_attribute(
    attribute_name="type",
    attribute_value="biological_sample",
    max_page_size=50,
    all_pages=True
)

print(f"Found {len(results)} matching records")

Get Record by ID

# Retrieve a specific record
record = api_client.get_record_by_id("sample-12345")
print(record)

Using Authentication

For endpoints that require authentication:

from bioepic_skills.auth import BioEPICAuth
from bioepic_skills.api_search import APISearch
import os
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Initialize authentication
auth = BioEPICAuth(
    client_id=os.getenv("CLIENT_ID"),
    client_secret=os.getenv("CLIENT_SECRET")
)

# Verify credentials
if auth.has_credentials():
    print("Authentication configured successfully")

    # Get token
    token = auth.get_token()
    print("Token acquired")

Data Processing Examples

Extract Specific Fields

from bioepic_skills.data_processing import DataProcessing

dp = DataProcessing()

# Extract IDs from records (the list returned by get_records() in Basic Usage)
ids = dp.extract_field(records, "id")
print(f"Extracted {len(ids)} IDs")

Build Custom Filters

# Build a MongoDB-style filter
filter_query = dp.build_filter(
    {"name": "test", "status": "active"},
    exact_match=False
)

# Use the filter in a query
filtered_records = api_client.get_record_by_filter(filter_query)
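
For orientation, a MongoDB-style filter is a plain dictionary that maps field names to values or operator expressions. The sketch below shows one plausible shape for exact and partial matching; `sketch_filter` is a hypothetical stand-in, and the actual output of `dp.build_filter` may differ.

```python
import json
import re


def sketch_filter(attributes, exact_match=False):
    """Build a MongoDB-style query dict (illustrative assumption only).

    Exact matching maps each field to its value; partial matching wraps
    the value in a case-insensitive $regex expression.
    """
    if exact_match:
        return dict(attributes)
    return {
        field: {"$regex": re.escape(str(value)), "$options": "i"}
        for field, value in attributes.items()
    }


query = sketch_filter({"name": "test", "status": "active"})
print(json.dumps(query, indent=2))
```

Escaping the value with `re.escape` keeps characters such as `.` from being interpreted as regex syntax.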

Merge DataFrames

# Merge two existing pandas DataFrames (df1, df2) on their shared "id" column
merged_df = dp.merge_dataframes("id", df1, df2)
print(f"Merged dataframe shape: {merged_df.shape}")
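
The snippet above assumes `df1` and `df2` already exist. As a point of reference, here is what a merge on a shared `id` column looks like in plain pandas. The sample data is made up, and `dp.merge_dataframes` may use different join settings than the inner join shown here.

```python
import pandas as pd

# Made-up sample data sharing an "id" column.
df1 = pd.DataFrame({"id": ["s1", "s2", "s3"], "name": ["A", "B", "C"]})
df2 = pd.DataFrame({"id": ["s2", "s3", "s4"], "depth_cm": [10, 20, 30]})

# Inner join: keeps only ids present in both frames.
merged = pd.merge(df1, df2, on="id", how="inner")
print(f"Merged dataframe shape: {merged.shape}")
```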

Split Lists into Chunks

# Split a large list into smaller chunks
large_list = list(range(250))
chunks = dp.split_list(large_list, chunk_size=100)
print(f"Split into {len(chunks)} chunks")
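
Chunking is useful when an endpoint limits how many IDs or values a single request may carry. The plain-Python sketch below shows the underlying idea; `chunked` is a hypothetical helper for illustration and is not the implementation behind `dp.split_list`.

```python
def chunked(items, chunk_size):
    """Split a list into consecutive chunks of at most chunk_size items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


chunks = chunked(list(range(250)), 100)
print([len(chunk) for chunk in chunks])  # [100, 100, 50]
```

The last chunk holds the remainder, so 250 items with a chunk size of 100 produce three chunks.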

Debugging

Enable debug logging to see detailed information:

import logging

from bioepic_skills.api_search import APISearch

logging.basicConfig(level=logging.DEBUG)

# Calls made from here on emit detailed debug output
api_client = APISearch(collection_name="samples")
records = api_client.get_records(max_page_size=5)
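
Root-level DEBUG logging also turns on output from every other library in the process. A common alternative is to raise verbosity for one logger only. The sketch below assumes the package logs under the name `bioepic_skills`, which is the usual result of `logging.getLogger(__name__)` but is not confirmed by this guide.

```python
import logging

# Keep third-party libraries at WARNING and above.
logging.basicConfig(
    level=logging.WARNING,
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)

# Assumed logger name; adjust if the package logs under a different one.
package_logger = logging.getLogger("bioepic_skills")
package_logger.setLevel(logging.DEBUG)

package_logger.debug("Debug output enabled for %s", package_logger.name)
```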

Common Patterns

Pagination - Get All Pages

# Get all pages of results
all_records = api_client.get_records(
    max_page_size=100,
    all_pages=True
)
print(f"Retrieved {len(all_records)} total records")
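
Conceptually, fetching all pages means repeating a request until the server stops returning a continuation token. The sketch below illustrates that loop against a fake in-memory data source. `fetch_page`, the token format, and the response shape are all hypothetical and say nothing about how bioepic_skills implements `all_pages=True`.

```python
# Fake data source standing in for a remote collection.
DATA = [{"id": f"sample-{i}"} for i in range(25)]


def fetch_page(page_token=None, max_page_size=10):
    """Fake paged endpoint: returns (records, next_token)."""
    start = int(page_token) if page_token is not None else 0
    end = start + max_page_size
    next_token = str(end) if end < len(DATA) else None
    return DATA[start:end], next_token


def fetch_all(max_page_size=10):
    """Follow continuation tokens until the source is exhausted."""
    records, token = fetch_page(None, max_page_size)
    while token is not None:
        page, token = fetch_page(token, max_page_size)
        records.extend(page)
    return records


all_records = fetch_all(max_page_size=10)
print(f"Retrieved {len(all_records)} total records")
```

A larger page size means fewer round trips, at the cost of bigger individual responses.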

Filter and Export

# Filter, convert to DataFrame, and export
results = api_client.get_record_by_attribute(
    attribute_name="category",
    attribute_value="research"
)

df = dp.convert_to_df(results)
df.to_csv("research_samples.csv", index=False)
print("Data exported to research_samples.csv")

Next Steps