
🏗️ Environment Setup

Time to Complete: 2-3 hours

This guide walks you through setting up your Azure Databricks environment for LLM development and deployment, integrating with Azure AI Foundry, and configuring GitHub for seamless CI/CD.

Table of Contents

  • Prerequisites
  • Azure Databricks Setup
  • Azure AI Foundry Integration
  • GitHub Configuration
  • Environment Management
  • Secrets Management
  • Validating Your Setup
  • Next Steps

Prerequisites

Before starting, ensure you have:

  • An Azure subscription with Owner or Contributor access
  • A GitHub account with repository creation permissions
  • Terraform installed locally (if using Infrastructure as Code)
  • Azure CLI installed and configured
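Before going further, you can confirm the local tooling is in place from a terminal (this assumes the tools above are already installed and on your PATH):

```shell
# Confirm required local tooling is available
az version            # Azure CLI
terraform -version    # Terraform (only needed if using Infrastructure as Code)
databricks --version  # Databricks CLI (used later for secrets and workspace folders)
```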

Azure Databricks Setup

Step 1: Create a Databricks Workspace

  1. Log in to the Azure Portal

  2. Navigate to "Create a resource" > Search for "Azure Databricks"

  3. Fill in the basics:

    • Subscription: Your Azure subscription
    • Resource Group: Create new or select existing
    • Workspace Name: llm-mlops-workspace
    • Region: Choose a region supporting the ML runtime (e.g., East US)
    • Pricing Tier: Premium (recommended for MLOps features)
  4. Review + Create > Create

⚠️ Note: The Premium tier is required for advanced security features and access controls.
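If you prefer scripting over the portal, the same workspace can be created with the Azure CLI (the resource group name below is a placeholder; the `databricks` CLI extension is installed automatically on first use):

```shell
# Create a resource group and a Premium-tier Databricks workspace
az group create --name llm-mlops-rg --location eastus

az databricks workspace create \
  --resource-group llm-mlops-rg \
  --name llm-mlops-workspace \
  --location eastus \
  --sku premium
```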

Step 2: Configure Databricks Clusters

  1. Launch your Databricks workspace
  2. Navigate to Compute > Create Cluster
  3. Configure your development cluster:
    • Cluster Name: llm-dev-cluster
    • Cluster Mode: Single Node or Standard
    • Databricks Runtime Version: 13.3 LTS ML or newer
    • Node Type: Select an appropriate node type with GPUs if needed for LLM training
    • Auto-termination: 120 minutes (recommended for cost savings)
{
  "cluster_name": "llm-dev-cluster",
  "spark_version": "13.3.x-gpu-ml-scala2.12",
  "node_type_id": "Standard_NC6s_v3",
  "driver_node_type_id": "Standard_NC6s_v3",
  "autotermination_minutes": 120,
  "spark_conf": {
    "spark.databricks.cluster.profile": "singleNode",
    "spark.master": "local[*]"
  },
  "custom_tags": {
    "Environment": "Development"
  }
}
  4. For production workloads, create a separate cluster:
    • Cluster Name: llm-prod-cluster
    • Cluster Mode: Standard (multi-node)
    • Autoscaling: Enabled, min: 2, max: 8 workers (adjust based on your needs)
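By analogy with the development cluster JSON above, the production cluster definition might look like this (the node type and worker counts are illustrative; tune them to your workload):

```json
{
  "cluster_name": "llm-prod-cluster",
  "spark_version": "13.3.x-gpu-ml-scala2.12",
  "node_type_id": "Standard_NC6s_v3",
  "driver_node_type_id": "Standard_NC6s_v3",
  "autoscale": {
    "min_workers": 2,
    "max_workers": 8
  },
  "autotermination_minutes": 120,
  "custom_tags": {
    "Environment": "Production"
  }
}
```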

Step 3: Set Up ML Runtime Components

  1. Create a new notebook in your workspace
  2. Run the following code to verify that the ML runtime components are installed:
# Verify installed versions
import mlflow
import torch
import transformers

print(f"MLflow version: {mlflow.__version__}")
print(f"PyTorch version: {torch.__version__}")
print(f"Transformers version: {transformers.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"CUDA version: {torch.version.cuda}")
    print(f"GPU count: {torch.cuda.device_count()}")
    print(f"GPU name: {torch.cuda.get_device_name(0)}")

Azure AI Foundry Integration

Step 1: Enable Azure OpenAI Service

  1. In the Azure Portal, search for "Azure OpenAI"
  2. Create a new Azure OpenAI resource
    • Subscription: Same as your Databricks workspace
    • Resource Group: Same as your Databricks workspace
    • Region: Choose an available region (e.g., East US)
    • Name: llm-openai-service
    • Pricing Tier: Standard S0
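The same resource can be provisioned from the Azure CLI (the resource group name is a placeholder carried over from earlier examples):

```shell
# Create the Azure OpenAI resource (kind OpenAI, Standard S0 tier)
az cognitiveservices account create \
  --name llm-openai-service \
  --resource-group llm-mlops-rg \
  --location eastus \
  --kind OpenAI \
  --sku S0
```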

Step 2: Deploy Foundation Models

  1. Go to your Azure OpenAI resource
  2. Select "Model Deployments"
  3. Deploy the following models:
    • Text Embedding Model: text-embedding-ada-002 (for embeddings)
    • LLM Model: gpt-4 or gpt-35-turbo (for completions)
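Deployments can also be created from the CLI. Available model versions change over time and vary by region, so the versions below are examples to adjust:

```shell
# Deploy an embedding model and a chat model under the resource created above
az cognitiveservices account deployment create \
  --name llm-openai-service \
  --resource-group llm-mlops-rg \
  --deployment-name text-embedding-ada-002 \
  --model-name text-embedding-ada-002 \
  --model-version "2" \
  --model-format OpenAI \
  --sku-name Standard --sku-capacity 1

az cognitiveservices account deployment create \
  --name llm-openai-service \
  --resource-group llm-mlops-rg \
  --deployment-name gpt-35-turbo \
  --model-name gpt-35-turbo \
  --model-version "0613" \
  --model-format OpenAI \
  --sku-name Standard --sku-capacity 1
```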

Step 3: Configure API Access

  1. In your Azure OpenAI resource, go to "Keys and Endpoint"
  2. Copy the endpoint and key for later use in Databricks
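If you prefer the CLI, the endpoint and primary key can be read directly (these values go into Databricks secret scopes later, never into code):

```shell
# Read the endpoint and primary key for the Azure OpenAI resource
az cognitiveservices account show \
  --name llm-openai-service --resource-group llm-mlops-rg \
  --query properties.endpoint -o tsv

az cognitiveservices account keys list \
  --name llm-openai-service --resource-group llm-mlops-rg \
  --query key1 -o tsv
```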

GitHub Configuration

Step 1: Repository Setup

  1. Create a new GitHub repository for your LLM project
  2. Clone the repository locally
  3. Set up the basic project structure (follow the structure in the main README)

Step 2: GitHub Copilot Configuration

  1. Navigate to your GitHub account settings
  2. Go to "Copilot" in the sidebar
  3. Enable GitHub Copilot for your account
  4. Install the Copilot extension in your IDE (VS Code, JetBrains IDEs, etc.)

Step 3: GitHub Advanced Security

  1. In your repository, go to "Settings" > "Security & analysis"
  2. Enable:
    • Dependency graph
    • Dependabot alerts
    • Dependabot security updates
    • Code scanning
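Code scanning is typically backed by a CodeQL workflow. A minimal one for a Python repository looks like the sketch below (the file path and branch name are assumptions; adjust to your repository):

```yaml
# .github/workflows/codeql.yml -- minimal CodeQL code-scanning workflow
name: "CodeQL"
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  analyze:
    runs-on: ubuntu-latest
    permissions:
      security-events: write
    steps:
      - uses: actions/checkout@v4
      - uses: github/codeql-action/init@v3
        with:
          languages: python
      - uses: github/codeql-action/analyze@v3
```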

Environment Management

Creating Development, Staging, and Production Environments

For each environment (development, staging, production):

  1. Create separate Databricks workspaces or use a single workspace with separate folders
  2. Set up environment-specific clusters with appropriate security and scaling configurations
  3. Use Databricks Repos to link code from GitHub to each environment
  4. Configure environment-specific variables in each workspace
# Example script to set up workspace folders
databricks workspace mkdirs /Development
databricks workspace mkdirs /Staging
databricks workspace mkdirs /Production

Secrets Management

Databricks Secret Scopes

  1. Create a secret scope in Databricks:
databricks secrets create-scope --scope llm-secrets
  2. Add your Azure OpenAI and other service credentials:
databricks secrets put --scope llm-secrets --key azure-openai-key --string-value "YOUR_OPENAI_KEY"
databricks secrets put --scope llm-secrets --key azure-openai-endpoint --string-value "YOUR_OPENAI_ENDPOINT"
  3. Access secrets in notebooks:
openai_key = dbutils.secrets.get(scope="llm-secrets", key="azure-openai-key")
openai_endpoint = dbutils.secrets.get(scope="llm-secrets", key="azure-openai-endpoint")

Key Vault Integration (Recommended for Production)

  1. Create an Azure Key Vault in the same resource group
  2. Add your secrets to Key Vault
  3. Set up Databricks to access Key Vault using Managed Identity
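A sketch of these steps with the CLI; the Key Vault name is a placeholder, and the final command uses the legacy Databricks CLI's Key Vault-backed scope support (depending on your CLI version, creating such scopes may instead require the workspace UI):

```shell
# 1. Create a Key Vault and store the OpenAI key in it
az keyvault create --name llm-mlops-kv --resource-group llm-mlops-rg --location eastus
az keyvault secret set --vault-name llm-mlops-kv --name azure-openai-key --value "YOUR_OPENAI_KEY"

# 2. Create a Key Vault-backed secret scope in Databricks
databricks secrets create-scope --scope llm-kv-secrets \
  --scope-backend-type AZURE_KEYVAULT \
  --resource-id "/subscriptions/<sub-id>/resourceGroups/llm-mlops-rg/providers/Microsoft.KeyVault/vaults/llm-mlops-kv" \
  --dns-name "https://llm-mlops-kv.vault.azure.net/"
```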

Validating Your Setup

Connection Testing

Run the following tests to verify your setup:

  1. Test Azure OpenAI connection:
import openai

# Configure the Azure OpenAI client (openai SDK < 1.0 interface)
openai.api_type = "azure"
openai.api_key = dbutils.secrets.get(scope="llm-secrets", key="azure-openai-key")
openai.api_base = dbutils.secrets.get(scope="llm-secrets", key="azure-openai-endpoint")
openai.api_version = "2023-05-15"

# Test connection
response = openai.ChatCompletion.create(
    engine="gpt-35-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello, are you working correctly?"}
    ]
)

print(response.choices[0].message.content)
  2. Test GitHub integration:
# If using Databricks Repos
%sh
git status

End-to-End Test

Create a simple notebook that:

  1. Loads data from storage
  2. Processes it using a simple model
  3. Logs the results using MLflow
  4. Serves predictions

This will validate that all components are working together properly.

Next Steps

Once your environment is set up, you can proceed to:

  1. LLM Development Workflows
  2. GitHub Platform Integration
  3. Testing and Validation Strategies