
Dify Integration

Overview

Dify is an open-source platform for developing production-ready LLM applications. It provides an intuitive interface that combines agentic AI workflows, RAG pipelines, agent capabilities, model management, and observability features. With its visual workflow builder and comprehensive model support, Dify enables rapid prototyping and deployment of AI applications without extensive coding.

Perfect for teams wanting to rapidly prototype and deploy AI applications with a visual interface while maintaining the flexibility to integrate custom models and workflows.

Key Features

  • Visual Workflow Builder: Drag-and-drop interface for creating complex AI workflows
  • Comprehensive Model Support: Seamless integration with 100+ LLMs including OpenAI-compatible APIs
  • RAG Pipeline: Built-in document processing and retrieval capabilities
  • Agent Framework: Create autonomous AI agents with custom tools and capabilities
  • Prompt IDE: Test and optimize prompts with model comparison features
  • Multi-tenant Architecture: Enterprise-ready with team collaboration features
  • API-First Design: RESTful APIs for all features enabling programmatic access

Use Cases

  • Build conversational AI applications and chatbots
  • Create AI-powered workflow automation systems
  • Develop RAG-based knowledge management solutions
  • Prototype and test AI applications before production deployment
  • Build multi-agent systems for complex task automation
  • Create custom AI assistants with specific domain knowledge

System Requirements

Before installation, ensure your system meets these minimum requirements:

  • CPU: 2 Core
  • RAM: 4 GB minimum
  • Storage: 10 GB free disk space
  • Docker: 20.10.17 or later
  • Docker Compose: 2.2.3 or later

Docker Installation

  1. Clone the Dify repository with the latest stable release:

    git clone --branch "$(curl -s https://api.github.com/repos/langgenius/dify/releases/latest | jq -r .tag_name)" https://github.com/langgenius/dify.git
  2. Navigate to the Docker directory:

    cd dify/docker
  3. Copy the environment configuration file:

    cp .env.example .env
  4. Edit the .env file to customize your deployment:

    # Core Service Configuration
    CONSOLE_API_URL=http://localhost
    SERVICE_API_URL=http://localhost/api
    APP_WEB_URL=http://localhost
    # Security - Change these in production!
    SECRET_KEY=your-secret-key-here
    INIT_PASSWORD=your-admin-password
    # Database Configuration
    DB_USERNAME=postgres
    DB_PASSWORD=difyai123456
    DB_HOST=db
    DB_PORT=5432
    DB_DATABASE=dify
    # Redis Configuration
    REDIS_HOST=redis
    REDIS_PORT=6379
    REDIS_PASSWORD=difyai123456
    # Vector Store (options: weaviate, qdrant, milvus, pgvector)
    VECTOR_STORE=weaviate
  5. Start Dify using Docker Compose:

    docker compose up -d
  6. Verify all services are running:

    docker compose ps

    You should see 11 services running including:

    • Core Services: api, worker, web
    • Dependencies: db, redis, nginx, weaviate, sandbox, ssrf_proxy
  7. Access the initialization page to set up the admin account:

    http://localhost/install

RelaxAI Integration Setup

  1. Log in to Dify and navigate to Settings → Model Provider

  2. Click on OpenAI-API-compatible provider

  3. If the provider is not installed, click Install Plugin when prompted

  4. Configure the OpenAI-API-compatible provider with your RelaxAI credentials (API key and endpoint URL); example settings are shown after this list

  5. Add RelaxAI models by clicking Add Model and configuring:

    Model Name: RelaxAI-Llama-4-Maverick
    Model ID: Llama-4-Maverick-17B-128E
    Type: LLM
    Capabilities: [CHAT]
    Context Length: 4096
    Max Tokens: 4096
  6. For additional models like DeepSeek-R1:

    Model Name: RelaxAI-DeepSeek-R1
    Model ID: DeepSeek-R1-0528
    Type: LLM
    Capabilities: [CHAT]
    Context Length: 4096
    Max Tokens: 4096
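
For reference, a completed provider entry typically looks like the following. The field names approximate Dify's OpenAI-API-compatible provider form; the endpoint URL and key below are placeholders, so substitute the values from your RelaxAI account:

API Key: <your RelaxAI API key>
API Endpoint URL: <your RelaxAI OpenAI-compatible base URL, typically ending in /v1>
Model Name: Llama-4-Maverick-17B-128E
Completion Mode: Chat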


Advanced Configuration

Custom Docker Compose Configuration: For production deployments, modify docker-compose.yaml:

services:
  api:
    image: langgenius/dify-api:latest
    environment:
      # Scaling configuration
      GUNICORN_WORKERS: 4
      CELERY_WORKER_AMOUNT: 2
      # Performance tuning
      SQLALCHEMY_POOL_SIZE: 30
      SQLALCHEMY_MAX_OVERFLOW: 60
      # Security hardening
      WEB_API_CORS_ALLOW_ORIGINS: "https://your-domain.com"
      CONSOLE_CORS_ALLOW_ORIGINS: "https://your-domain.com"

Vector Database Options: Dify supports multiple vector stores. Configure in .env:

# Weaviate (default)
VECTOR_STORE=weaviate
WEAVIATE_ENDPOINT=http://weaviate:8080
# Qdrant
VECTOR_STORE=qdrant
QDRANT_URL=http://qdrant:6333
# PGVector
VECTOR_STORE=pgvector
PGVECTOR_HOST=pgvector
PGVECTOR_PORT=5432

SSL/HTTPS Configuration: For production, enable HTTPS:

# In .env file
NGINX_HTTPS_ENABLED=true
NGINX_SSL_PORT=443
NGINX_SERVER_NAME=your-domain.com
# Certbot auto-renewal is included

Workflow Development

Create AI workflows using Dify’s visual builder:

  1. Create New App: Choose between Chat, Completion, or Workflow app types

  2. Design Workflow:

    • Add nodes: LLM, Knowledge Retrieval, Code Execution, HTTP Request
    • Connect nodes to define data flow
    • Configure each node with specific parameters
  3. Integrate RelaxAI Models:

    Node: LLM
    Model: RelaxAI-Llama-4-Maverick
    Temperature: 0.7
    Max Tokens: 2048
    System Prompt: "You are a helpful assistant..."
  4. Test and Debug: Use the built-in testing interface to validate workflows

  5. Deploy: Get API endpoints for programmatic access

API Integration

Access Dify applications via their REST API. Each application's API key is scoped to that app, so the key alone identifies which app is being called:

# Get the API key from the app's API Access page
API_KEY="your-app-api-key"

# Chat messages (Chat apps)
curl -X POST "http://localhost/v1/chat-messages" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {},
    "query": "Hello, how can you help me?",
    "response_mode": "blocking",
    "conversation_id": "",
    "user": "example-user"
  }'

# Workflow execution (Workflow apps)
curl -X POST "http://localhost/v1/workflows/run" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {
      "input_field": "value"
    },
    "response_mode": "blocking",
    "user": "example-user"
  }'

Performance Optimization

  • Database Indexing: Ensure proper indexes on frequently queried columns
  • Redis Configuration: Increase memory allocation for heavy caching needs
  • Worker Scaling: Adjust CELERY_WORKER_AMOUNT based on workload
  • Vector Store Optimization: Use dedicated vector database for large-scale RAG
  • Model Caching: Enable model response caching for repeated queries
  • Resource Limits: Set Docker resource constraints to prevent resource exhaustion
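
As a sketch of the last point, per-service limits can be declared in docker-compose.yaml; recent Docker Compose releases apply deploy.resources.limits even outside Swarm mode. The values below are illustrative only and should be sized to your hardware:

services:
  api:
    deploy:
      resources:
        limits:
          cpus: "2"
          memory: 4G
  worker:
    deploy:
      resources:
        limits:
          cpus: "2"
          memory: 2G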

Monitoring and Observability

Monitor Dify deployment health:

# View logs for specific service
docker compose logs -f api
# Monitor resource usage
docker stats
# Check service health
curl http://localhost/health
# Access metrics (if enabled)
curl http://localhost/metrics

Backup and Recovery

Regular backup procedures:

# Backup database
docker compose exec db pg_dump -U postgres dify > dify_backup_$(date +%Y%m%d).sql
# Backup volumes
tar -czf dify_volumes_$(date +%Y%m%d).tar.gz volumes/
# Restore database
docker compose exec -T db psql -U postgres dify < dify_backup.sql
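
To keep backups regular rather than manual, the pg_dump command above can be scheduled from the host's crontab. The paths below are placeholders; note the escaped % signs, which cron would otherwise treat as line separators:

# Nightly Dify database backup at 02:00 (adjust paths to your deployment)
0 2 * * * cd /opt/dify/docker && docker compose exec -T db pg_dump -U postgres dify > /backups/dify_backup_$(date +\%Y\%m\%d).sql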

Troubleshooting

  • Services not starting: Check for port conflicts and ensure ports 80, 443, 5432, and 6379 are available (see the commands after this list)
  • Model provider errors: Verify API credentials and endpoint URLs are correct
  • Memory issues: Increase Docker memory allocation, minimum 4GB recommended
  • Vector store connection failed: Ensure selected vector database service is running
  • SSL certificate issues: Check Certbot container logs and domain configuration
  • Slow performance: Scale worker instances and optimize database queries
  • Container networking: Use host.docker.internal for host machine access
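
For the port-conflict case in the first item, the following commands show which process is already listening on a required port (assuming a Linux host with ss or lsof available):

# List listening sockets on the ports Dify needs
sudo ss -ltnp | grep -E ':(80|443|5432|6379)[[:space:]]'
# Alternative using lsof
sudo lsof -i :80 -sTCP:LISTEN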

Best Practices

  • Always use environment variables for sensitive configuration
  • Regularly update to latest Dify version for security patches
  • Implement proper backup strategies before production deployment
  • Use dedicated vector database for production RAG applications
  • Monitor resource usage and scale services accordingly
  • Enable HTTPS for production deployments
  • Implement rate limiting for public-facing APIs (a sample Nginx snippet follows this list)
  • Use persistent volumes for data that needs to survive container restarts
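
To illustrate the rate-limiting point, Nginx's limit_req module can throttle requests before they reach the Dify API. This is a minimal sketch to adapt into the nginx configuration shipped with Dify, not a drop-in file; the zone name, rate, and upstream address are assumptions to adjust:

# Shared zone keyed by client IP, allowing 10 requests per second
limit_req_zone $binary_remote_addr zone=dify_api:10m rate=10r/s;

server {
    location /v1/ {
        # Permit short bursts, reject the excess with HTTP 429
        limit_req zone=dify_api burst=20 nodelay;
        limit_req_status 429;
        proxy_pass http://api:5001;
    }
}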

Example Usage: Building a RAG-Powered Documentation Assistant

This section demonstrates how to build a sophisticated RAG system using Dify with RelaxAI models. We’ll create a documentation assistant that can answer questions with context-aware responses.

Step 1: Create Knowledge Base

  1. Navigate to Knowledge Section
  • Log in to Dify dashboard
  • Click “Knowledge” in the left sidebar
  • Click “Create Knowledge”
  2. Configure Knowledge Base

    Name: Product Documentation
    Description: Technical documentation for our product
    Permission: Only me (or Team based on your needs)
  3. Upload Documents

    • Click “Import” and select your files (PDF, TXT, MD, DOCX)
    • Configure text processing:
    Segmentation Settings:
    Mode: Automatic
    Chunk Size: 500 tokens
    Chunk Overlap: 50 tokens
    Preprocessing Rules:
    - Remove extra spaces: Yes
    - Remove URLs/emails: No (keep for technical docs)
  4. Set Embedding Model

    Embedding Model: Mistral-7b-embedding
    Vector Database: Weaviate
    Retrieval Settings:
    Top K: 5
    Score Threshold: 0.7

Step 2: Create Workflow Application

  1. Create New App

    • Go to “Studio” → “Create App”
    • Select “Workflow” type
    • Name: “Documentation Assistant”
  2. Design the Workflow

    Visual workflow structure:

    [Start] → [Knowledge Retrieval] → [Context Processing] → [LLM Generation] → [Output] → [End]
  3. Configure Workflow Nodes

    Start Node (Input):

    Variables:
      - user_query:
          Type: String
          Required: true
          Description: "User's question"

    Knowledge Retrieval Node:

    Node Type: Knowledge Retrieval
    Knowledge Base: Product Documentation
    Query: {{user_query}}
    Top K: 5
    Score Threshold: 0.7
    Reranking Model: RelaxAI-DeepSeek-R1

    Context Processing Node (Code):

    def main(retrieved_docs, user_query):
        if not retrieved_docs:
            return {"context": "No relevant documentation found.", "has_context": False}
        formatted_context = []
        for idx, doc in enumerate(retrieved_docs[:3]):
            formatted_context.append(f"""
    Document {idx + 1}:
    Source: {doc.metadata.get('source', 'Unknown')}
    Content: {doc.content}
    ---""")
        return {
            "context": "\n".join(formatted_context),
            "has_context": True,
            "source_count": len(retrieved_docs)
        }

    LLM Generation Node:

    Node Type: LLM
    Model: RelaxAI-Llama-4-Maverick
    Temperature: 0.3
    Max Tokens: 1500
    System Prompt: |
      You are a helpful documentation assistant. Answer questions based
      ONLY on the provided context. If the context doesn't contain the
      answer, say so clearly. Cite sources when referencing information.
    User Prompt: |
      Context: {{context}}
      Question: {{user_query}}
      Please provide a comprehensive answer based on the documentation.

    Output Formatting Node:

    Node Type: Template
    Template: |
      ## Answer
      {{llm_response}}
      ---
      **Sources Consulted:** {{source_count}} documents

Step 3: Test the Application

  1. Run Test Queries
  • Click “Preview” in the workflow editor and test with sample questions:
    • “How do I configure authentication?”
    • “What are the system requirements?”
    • “How to troubleshoot connection errors?”
  2. Verify Responses
    • Check accuracy against source documents
    • Ensure proper source citation
    • Validate handling of out-of-context questions

Step 4: Deploy and Integrate

  1. Publish the Workflow

    • Click “Publish” in the editor
    • Navigate to “API Access”
    • Copy API credentials
  2. API Integration Example

    import requests

    class DifyDocAssistant:
        def __init__(self, api_key, base_url="http://localhost/v1"):
            # The API key is issued per app, so no separate app_id is needed
            self.api_key = api_key
            self.base_url = base_url

        def ask(self, question, user="docs-assistant-user"):
            endpoint = f"{self.base_url}/workflows/run"
            headers = {
                "Authorization": f"Bearer {self.api_key}",
                "Content-Type": "application/json"
            }
            payload = {
                "inputs": {"user_query": question},
                "response_mode": "blocking",
                "user": user
            }
            response = requests.post(endpoint, headers=headers, json=payload)
            response.raise_for_status()
            # Blocking workflow runs return their outputs under data.outputs
            return response.json()["data"]["outputs"]["formatted_response"]

    # Usage
    assistant = DifyDocAssistant("your-api-key")
    answer = assistant.ask("How do I set up authentication?")
    print(answer)
  3. JavaScript/Node.js Integration

    const axios = require('axios');

    class DifyAssistant {
      constructor(apiKey, baseUrl = 'http://localhost/v1') {
        // The API key is issued per app, so no separate app_id is needed
        this.apiKey = apiKey;
        this.baseUrl = baseUrl;
      }

      async ask(question, user = 'docs-assistant-user') {
        try {
          const response = await axios.post(
            `${this.baseUrl}/workflows/run`,
            {
              inputs: { user_query: question },
              response_mode: 'blocking',
              user
            },
            {
              headers: {
                'Authorization': `Bearer ${this.apiKey}`,
                'Content-Type': 'application/json'
              }
            }
          );
          // Blocking workflow runs return their outputs under data.outputs
          return response.data.data.outputs.formatted_response;
        } catch (error) {
          console.error('Error:', error);
          throw error;
        }
      }
    }

    // Usage (wrapped so await is valid in CommonJS)
    (async () => {
      const assistant = new DifyAssistant('your-api-key');
      const answer = await assistant.ask('What are the API rate limits?');
      console.log(answer);
    })();

Step 5: Advanced Features

  1. Hybrid Search Implementation

    Add a code node for enhanced retrieval:

    def hybrid_search(query, knowledge_base):
        # Semantic (vector) search
        semantic_results = knowledge_base.semantic_search(query, top_k=10)
        # Keyword (full-text) search
        keyword_results = knowledge_base.keyword_search(query, top_k=10)
        # Combine and rerank; merge_results and rerank_with_model are
        # placeholders for your own merging and reranking helpers
        combined = merge_results(semantic_results, keyword_results)
        return rerank_with_model(combined, query, "RelaxAI-DeepSeek-R1")[:5]
  2. Conversation Memory

    Add conversation tracking:

    Node Type: Variable Assigner
    Operation: Store conversation
    Variables:
      conversation_history:
        - query: {{user_query}}
        - response: {{llm_response}}
        - timestamp: {{current_timestamp}}
    Storage: Redis cache (30-minute TTL)
  3. Query Expansion

    Improve retrieval accuracy:

    def expand_query(original_query):
        # llm_call is a placeholder for however the model is invoked in your setup
        prompt = f"Generate 3 alternative phrasings for: {original_query}"
        expanded = llm_call("RelaxAI-Llama-4-Maverick", prompt, temperature=0.7)
        return [original_query] + expanded.split('\n')

Performance Optimization Tips

  • Caching: Enable response caching for frequent queries (a Redis-based sketch follows this list)
  • Batch Processing: Process multiple documents in parallel during ingestion
  • Model Selection: Use RelaxAI-DeepSeek-R1 for reranking, Llama-4-Maverick for generation
  • Chunk Size: Experiment with 300-700 tokens based on your content type
  • Retrieval Settings: Adjust Top K and threshold based on document corpus size
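
As one way to implement the caching tip, responses can be stored in the Redis instance that ships with Dify, keyed by a hash of the query. This is a minimal sketch assuming the Redis host and password from the .env example earlier; compute_answer stands in for whatever function actually calls the workflow (for example, DifyDocAssistant.ask):

import hashlib
import redis

# Connection values taken from the example .env; use host "redis" inside the
# Docker network, or "localhost" from the host if the port is exposed
cache = redis.Redis(host="localhost", port=6379, password="difyai123456")

def cached_answer(question, compute_answer, ttl_seconds=1800):
    """Return a cached response when available, otherwise compute and store one."""
    key = "dify:answer:" + hashlib.sha256(question.encode("utf-8")).hexdigest()
    hit = cache.get(key)
    if hit is not None:
        return hit.decode("utf-8")
    answer = compute_answer(question)
    cache.setex(key, ttl_seconds, answer)
    return answer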

Monitoring and Analytics

Track these metrics for continuous improvement:

Retrieval Metrics:
- Average retrieval score
- Query success rate
- Document coverage
Generation Metrics:
- Response time
- Token usage per query
- User satisfaction scores
System Metrics:
- API latency
- Cache hit rate
- Error rate
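
As a starting point for the latency and error-rate figures, each API call can be wrapped with simple timing and counters. The sketch below assumes the DifyDocAssistant class from Step 4 and keeps its numbers in memory; a production setup would export them to a metrics backend instead:

import time

class MetricsRecorder:
    def __init__(self):
        self.calls = 0
        self.errors = 0
        self.total_latency = 0.0

    def timed_ask(self, assistant, question):
        """Run a query while recording latency and error counts."""
        self.calls += 1
        start = time.monotonic()
        try:
            return assistant.ask(question)
        except Exception:
            self.errors += 1
            raise
        finally:
            self.total_latency += time.monotonic() - start

    def summary(self):
        """Aggregate figures: call count, error rate, and average latency in seconds."""
        if not self.calls:
            return {"calls": 0, "error_rate": 0.0, "avg_latency_s": 0.0}
        return {
            "calls": self.calls,
            "error_rate": self.errors / self.calls,
            "avg_latency_s": round(self.total_latency / self.calls, 3),
        }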

This example demonstrates the power of combining Dify’s visual workflow builder with RelaxAI’s models to create production-ready AI applications without extensive coding.

Resources