
Overview

Viewing the Python integration. For the TypeScript version of this guide, see the TypeScript LangChain integration.
Valyu integrates with LangChain as a search tool, so you can give AI agents and RAG applications access to real-time web search and proprietary data sources. The integration returns LLM-ready context from multiple sources, including web pages, academic journals, and financial data.
The package includes two main tools:
  • ValyuSearchTool: Deep search operations with comprehensive parameter control
  • ValyuContentsTool: Extract clean content from specific URLs

Installation

Install the official LangChain Valyu package:
pip install -U langchain-valyu
Configure credentials by setting the following environment variable:
export VALYU_API_KEY="your-valyu-api-key-here"
Or set it programmatically:
import os
os.environ["VALYU_API_KEY"] = "your-valyu-api-key-here"
For agent examples, you’ll also need:
export ANTHROPIC_API_KEY="your-anthropic-api-key"  # For Claude examples
export OPENAI_API_KEY="your-openai-api-key"        # For OpenAI examples
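Before running the examples, it can help to confirm that the required keys are actually set. The following is a minimal sketch using only the standard library; the helper name `missing_env_vars` is hypothetical and not part of `langchain-valyu`, and the variable names are the ones used in this guide.

```python
import os


def missing_env_vars(names, env=None):
    """Return the names that are unset or blank in the given environment mapping."""
    # Default to the real process environment; a plain dict can be passed for testing.
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name, "").strip()]


# Illustrative environment; in real use, call missing_env_vars([...]) with no second argument.
example_env = {"VALYU_API_KEY": "your-valyu-api-key-here", "ANTHROPIC_API_KEY": ""}
print(missing_env_vars(["VALYU_API_KEY", "ANTHROPIC_API_KEY", "OPENAI_API_KEY"], example_env))
# prints ['ANTHROPIC_API_KEY', 'OPENAI_API_KEY']
```

Failing early with a clear list of missing keys is usually easier to debug than an authentication error raised in the middle of an agent run.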

Free Credits

Get your API key with $10 credit from the Valyu Platform.

Basic Usage

import os
from langchain_valyu import ValyuSearchTool

# Set your API key
os.environ["VALYU_API_KEY"] = "your-api-key-here"

# Initialize the search tool
tool = ValyuSearchTool()

# Perform a search
search_results = tool._run(
    query="What are agentic search-enhanced large reasoning models?",
    search_type="all",  # "all", "web", "proprietary", or "news"
    max_num_results=5,
    relevance_threshold=0.5,
    max_price=30.0
)

print("Search Results:", search_results.results)

Using ValyuContentsTool for Content Extraction

Extract clean, structured content from specific URLs:
import os
from langchain_valyu import ValyuContentsTool

# Set your API key
os.environ["VALYU_API_KEY"] = "your-api-key-here"

# Initialize the contents tool
contents_tool = ValyuContentsTool()

# Extract content from URLs
urls = [
    "https://arxiv.org/abs/2301.00001",
    "https://example.com/article",
]

extracted_content = contents_tool._run(urls=urls)
print("Extracted Content:", extracted_content.results)

# Print individual results
for result in extracted_content.results:
    print(f"URL: {result['url']}")
    print(f"Title: {result['title']}")
    print(f"Content: {result['content'][:200]}...")
    print(f"Status: {result['status']}")
    print("---")

Using with LangChain Agents

The most powerful way to use Valyu is within LangChain agents, where the AI can dynamically decide when and how to search:
pip install langchain-anthropic langgraph
import os
from langchain_valyu import ValyuSearchTool
from langchain_anthropic import ChatAnthropic
from langgraph.prebuilt import create_react_agent
from langchain_core.messages import HumanMessage

# Set API keys
os.environ["VALYU_API_KEY"] = "your-valyu-api-key"
os.environ["ANTHROPIC_API_KEY"] = "your-anthropic-api-key"

# Initialize components
llm = ChatAnthropic(model="claude-sonnet-4-20250514")
valyu_search_tool = ValyuSearchTool()

# Create agent with Valyu search capability
agent = create_react_agent(llm, [valyu_search_tool])

# Use the agent
user_input = "What are the key factors driving recent stock market volatility, and how do macroeconomic indicators influence equity prices across different sectors?"

for step in agent.stream(
    {"messages": [HumanMessage(content=user_input)]},
    stream_mode="values",
):
    step["messages"][-1].pretty_print()

Advanced Configuration

Search Parameters

The ValyuSearchTool supports comprehensive search parameters for fine-tuned control:
from langchain_valyu import ValyuSearchTool

tool = ValyuSearchTool()

# Advanced search with all available parameters
results = tool._run(
    query="quantum computing breakthroughs 2024",
    search_type="proprietary",  # "all", "web", "proprietary", or "news"
    max_num_results=10,  # 1-20 for standard API keys; up to 100 with a special API key (see http://platform.valyu.ai/user/account/apikeys?req=increase_results)
    relevance_threshold=0.6,  # 0.0-1.0 relevance score
    max_price=30.0,  # Maximum cost in dollars per thousand retrievals (CPM)
    is_tool_call=True,  # Optimized for LLM consumption
    start_date="2024-01-01",  # Time filtering (YYYY-MM-DD)
    end_date="2024-12-31",
    included_sources=["valyu/valyu-arxiv", "valyu/valyu-pubmed"],  # Include specific sources
    excluded_sources=["example.com"],  # Exclude sources
    response_length="medium",  # "short", "medium", "large", "max", or int
    country_code="US",  # 2-letter ISO country code
    fast_mode=False,  # Enable for faster but shorter results
)
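The `start_date` and `end_date` parameters take `YYYY-MM-DD` strings. For rolling windows such as "the last 30 days", a small helper can compute them. This is a sketch under the assumption that both values are passed as plain ISO date strings; `date_window` is a hypothetical name, not part of the package.

```python
from datetime import date, timedelta
from typing import Dict, Optional


def date_window(days_back: int, today: Optional[date] = None) -> Dict[str, str]:
    """Build start_date/end_date strings covering the last `days_back` days."""
    if days_back < 0:
        raise ValueError("days_back must be non-negative")
    end = today or date.today()
    start = end - timedelta(days=days_back)
    # date.isoformat() yields YYYY-MM-DD, the format documented for these parameters.
    return {"start_date": start.isoformat(), "end_date": end.isoformat()}


print(date_window(30, today=date(2024, 12, 31)))
# prints {'start_date': '2024-12-01', 'end_date': '2024-12-31'}
```

The returned dictionary can be unpacked into a search call, for example `tool._run(query="...", **date_window(30))`.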

Source Filtering

Control which sources are included or excluded from your search:
# Include only academic sources
academic_results = tool._run(
    query="machine learning research 2024",
    search_type="proprietary",
    included_sources=["arxiv.org", "pubmed.ncbi.nlm.nih.gov", "ieee.org"],
    max_num_results=8
)

# Exclude specific domains
filtered_results = tool._run(
    query="AI policy developments",
    search_type="web",
    excluded_sources=["example.com", "example.org", "example.net"],
    max_num_results=10
)

Multi-Agent Workflows

Use Valyu in complex multi-agent systems:
# Requires the OpenAI integration as well: pip install langchain-openai
from langchain_valyu import ValyuSearchTool
from langchain_anthropic import ChatAnthropic
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent
from langchain_core.messages import HumanMessage

# Create specialized research agent
research_llm = ChatAnthropic(model="claude-sonnet-4-20250514", temperature=0.1)
research_tool = ValyuSearchTool()

research_agent = create_react_agent(
    research_llm,
    [research_tool]
)

# Create analysis agent
analysis_llm = ChatOpenAI(model="gpt-5", temperature=0.3)
analysis_agent = create_react_agent(
    analysis_llm,
    [research_tool]
)

# Coordinate agents for complex queries
research_query = "Find recent papers on transformer architecture improvements"
analysis_query = "Analyze market trends in AI chip demand"

# Execute research agent
for step in research_agent.stream(
    {"messages": [HumanMessage(content=research_query)]},
    stream_mode="values",
):
    step["messages"][-1].pretty_print()

# Execute analysis agent
for step in analysis_agent.stream(
    {"messages": [HumanMessage(content=analysis_query)]},
    stream_mode="values",
):
    step["messages"][-1].pretty_print()

Example Applications

Financial Research Assistant

from langchain_valyu import ValyuSearchTool
from langchain_anthropic import ChatAnthropic
from langgraph.prebuilt import create_react_agent
from langchain_core.messages import HumanMessage, SystemMessage

# Create financial research agent
financial_llm = ChatAnthropic(model="claude-sonnet-4-20250514")
valyu_tool = ValyuSearchTool()

financial_agent = create_react_agent(financial_llm, [valyu_tool])

# Query financial markets with system context
query = "What are the latest developments in cryptocurrency regulation and their impact on institutional adoption?"

system_context = SystemMessage(content="""You are a financial research assistant. Use Valyu to search for:
- Real-time market data and news
- Academic research on financial models
- Economic indicators and analysis

Always cite your sources and provide context about data recency.""")

for step in financial_agent.stream(
    {"messages": [system_context, HumanMessage(content=query)]},
    stream_mode="values",
):
    step["messages"][-1].pretty_print()

Academic Research Agent

from langchain_valyu import ValyuSearchTool

# Configure for academic research
academic_tool = ValyuSearchTool()

# Search academic sources specifically
academic_results = academic_tool._run(
    query="CRISPR gene editing safety protocols",
    search_type="proprietary",  # Focus on academic datasets
    max_num_results=8,
    relevance_threshold=0.6,
)

print("Academic Sources Found:", len(academic_results.results))
for result in academic_results.results:
    print(f"Title: {result['title']}")
    print(f"Source: {result['source']}")
    print(f"Relevance: {result['relevance_score']}")
    print("---")
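Results can also be filtered and ranked on the client side before display. The sketch below assumes each result is a dictionary with the `title`, `source`, and `relevance_score` keys used in the example above; `top_results` is a hypothetical helper and the sample entries are placeholders, not real search output.

```python
from typing import Any, Dict, List


def top_results(results: List[Dict[str, Any]], min_relevance: float = 0.6, limit: int = 5) -> List[Dict[str, Any]]:
    """Keep results at or above min_relevance, highest score first, at most `limit` items."""
    kept = [r for r in results if r.get("relevance_score", 0.0) >= min_relevance]
    kept.sort(key=lambda r: r.get("relevance_score", 0.0), reverse=True)
    return kept[:limit]


# Placeholder data for illustration only.
sample = [
    {"title": "Paper A", "source": "valyu/valyu-arxiv", "relevance_score": 0.82},
    {"title": "Paper B", "source": "valyu/valyu-pubmed", "relevance_score": 0.55},
    {"title": "Paper C", "source": "valyu/valyu-arxiv", "relevance_score": 0.91},
]

for result in top_results(sample):
    print(f"{result['relevance_score']:.2f}  {result['title']}  ({result['source']})")
```

With the placeholder data, "Paper C" is listed first, then "Paper A"; "Paper B" falls below the 0.6 cutoff and is dropped.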

Best Practices

1. Cost Optimization

# Set appropriate price limits based on use case
tool = ValyuSearchTool()

# For quick lookups
quick_search = tool._run(
    query="current bitcoin price",
    max_price=30.0,  # Lower cost for simple queries
    max_num_results=3
)

# For comprehensive research
detailed_search = tool._run(
    query="comprehensive analysis of renewable energy trends",
    max_price=50.0,  # Higher budget for complex queries
    max_num_results=15,
    search_type="all"
)
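One way to keep price limits consistent across a codebase is to name them once and reuse them. The following sketch collects the two budgets shown above into presets; `PRESETS` and `search_kwargs` are hypothetical names, and the values simply mirror the examples in this section.

```python
from typing import Any, Dict

# Values copied from the quick-lookup and comprehensive-research examples above.
PRESETS: Dict[str, Dict[str, Any]] = {
    "quick": {"max_price": 30.0, "max_num_results": 3},
    "detailed": {"max_price": 50.0, "max_num_results": 15, "search_type": "all"},
}


def search_kwargs(query: str, profile: str = "quick", **overrides: Any) -> Dict[str, Any]:
    """Merge a named preset with per-call overrides into keyword arguments."""
    if profile not in PRESETS:
        raise ValueError(f"unknown profile: {profile!r}")
    # Later entries win, so explicit overrides take precedence over the preset.
    return {"query": query, **PRESETS[profile], **overrides}


print(search_kwargs("current bitcoin price"))
print(search_kwargs("renewable energy trends", "detailed", max_num_results=10))
```

The resulting dictionary can be unpacked into the tool call, for example `tool._run(**search_kwargs("current bitcoin price"))`.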

2. Search Type Selection

# Web search for current events
web_results = tool._run(
    query="latest AI policy developments",
    search_type="web",
    max_num_results=5
)

# Proprietary search for academic research
academic_results = tool._run(
    query="machine learning interpretability methods",
    search_type="proprietary",
    max_num_results=8
)

# Combined search for comprehensive coverage
all_results = tool._run(
    query="climate change economic impact",
    search_type="all",
    max_num_results=10
)
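If the kind of query is known in advance, the choice of `search_type` can be made explicit in code. This is a minimal sketch: the intent labels are hypothetical, while the search types are the ones this guide lists ("news" appears in the parameter reference).

```python
# Intent labels are illustrative; the search_type values come from this guide.
SEARCH_TYPE_BY_INTENT = {
    "academic": "proprietary",
    "current_events": "web",
    "news": "news",
    "comprehensive": "all",
}


def pick_search_type(intent: str) -> str:
    """Return the search_type for an intent label, falling back to "all"."""
    return SEARCH_TYPE_BY_INTENT.get(intent.strip().lower(), "all")


print(pick_search_type("academic"))  # prints proprietary
print(pick_search_type("unknown"))   # prints all
```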

3. Error Handling and Fallbacks

from typing import Optional

from langchain_valyu import ValyuSearchTool

def robust_search(query: str, fallback_query: Optional[str] = None):
    tool = ValyuSearchTool()

    try:
        # Primary search
        results = tool._run(
            query=query,
            max_price=30.0,
            max_num_results=5
        )
        return results
    except Exception as e:
        print(f"Primary search failed: {e}")

        if fallback_query:
            try:
                # Fallback with simpler query
                results = tool._run(
                    query=fallback_query,
                    max_price=30.0,
                    max_num_results=3,
                    search_type="web"
                )
                return results
            except Exception as e2:
                print(f"Fallback search also failed: {e2}")
                return "Search unavailable"

        return "Search failed"

# Usage
results = robust_search(
    "complex quantum entanglement applications",
    "quantum entanglement basics"
)
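For transient failures, retrying the same query with exponential backoff is a common complement to a fallback query. The sketch below is generic and does not depend on Valyu: `with_retries` is a hypothetical helper that wraps any zero-argument callable, and the `sleep` parameter is injectable so the behavior can be checked without waiting.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on any exception with delays of base_delay * 2**attempt."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            # Re-raise the last error once the attempt budget is exhausted.
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
    raise RuntimeError("unreachable")


# Demonstration with a stand-in function that fails twice, then succeeds.
calls = {"count": 0}

def flaky_search() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("temporary failure")
    return "ok"

delays = []
print(with_retries(flaky_search, sleep=delays.append), delays)
# prints ok [1.0, 2.0]
```

In practice the callable could be a lambda around the search, for example `with_retries(lambda: tool._run(query="...", max_num_results=5))`. Catching every exception is a simplification; narrowing it to network or rate-limit errors is likely preferable once the specific exception types are known.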

4. Agent System Messages

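from langchain_anthropic import ChatAnthropic
from langchain_core.messages import SystemMessage, HumanMessage
from langchain_valyu import ValyuSearchTool
from langgraph.prebuilt import create_react_agent

# Initialize the model (same model as in the earlier agent examples)
llm = ChatAnthropic(model="claude-sonnet-4-20250514")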

# Optimize agent behavior with good system messages
system_message = SystemMessage(content="""You are an AI research assistant with access to Valyu search.

SEARCH GUIDELINES:
- Use search_type="proprietary" for academic/scientific queries
- Use search_type="web" for current events and general web content
- Use search_type="news" for news articles only
- Use search_type="all" for comprehensive research
- Set higher relevance_threshold (0.6+) for precise results
- Use category parameter to guide search context
- Do not use search operators (e.g., site:, OR, AND, quotes). Use natural keyword queries instead.
- Always cite sources from search results

RESPONSE FORMAT:
- Provide direct answers based on search results
- Include source citations with URLs when available
- Mention publication dates for time-sensitive information
- Indicate if information might be outdated""")

valyu_tool = ValyuSearchTool()
agent = create_react_agent(llm, [valyu_tool])

# Use the agent with system context
for step in agent.stream(
    {"messages": [system_message, HumanMessage(content="Your query here")]},
    stream_mode="values",
):
    step["messages"][-1].pretty_print()
For complete query-writing guidelines, including how to use API parameters instead of search operators, see the Prompting Guide.

API Reference

For complete parameter documentation, see the Valyu API Reference.

ValyuSearchTool Parameters

  • query (required): Natural language search query
  • search_type: "all", "web", "proprietary", or "news" (default: "all")
  • max_num_results: 1-20 results for standard API keys, up to 100 with a special API key (default: 5)
  • relevance_threshold: 0.0-1.0 relevance score (default: 0.5)
  • max_price: Maximum cost in dollars per thousand retrievals (CPM). Only applies when provided. If not provided, adjusts automatically based on search type and max number of results.
  • is_tool_call: Optimize for LLM consumption (default: true)
  • start_date/end_date: Time filtering in YYYY-MM-DD format (optional)
  • included_sources: List of URLs/domains to include (optional)
  • excluded_sources: List of URLs/domains to exclude (optional)
  • response_length: Content length - int, "short", "medium", "large", "max" (optional)
  • country_code: 2-letter ISO country code for geo-bias (optional)
  • fast_mode: Enable for faster but shorter results (default: false)
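The ranges and formats listed above can be checked locally before a request is sent. This is a minimal sketch, not the validation the package itself performs: `validate_search_params` is a hypothetical helper, and the limit of 20 results assumes a standard API key.

```python
from datetime import datetime
from typing import Any, Dict, List

ALLOWED_SEARCH_TYPES = {"all", "web", "proprietary", "news"}


def validate_search_params(params: Dict[str, Any], max_results_limit: int = 20) -> List[str]:
    """Return a list of problems; an empty list means the parameters look valid."""
    problems = []
    if not str(params.get("query", "")).strip():
        problems.append("query is required")
    if params.get("search_type", "all") not in ALLOWED_SEARCH_TYPES:
        problems.append("search_type must be one of: all, web, proprietary, news")
    n = params.get("max_num_results", 5)
    if not isinstance(n, int) or not 1 <= n <= max_results_limit:
        problems.append(f"max_num_results must be an integer from 1 to {max_results_limit}")
    threshold = params.get("relevance_threshold", 0.5)
    if not isinstance(threshold, (int, float)) or not 0.0 <= threshold <= 1.0:
        problems.append("relevance_threshold must be between 0.0 and 1.0")
    for key in ("start_date", "end_date"):
        value = params.get(key)
        if value is not None:
            try:
                datetime.strptime(value, "%Y-%m-%d")
            except (TypeError, ValueError):
                problems.append(f"{key} must be a date in YYYY-MM-DD format")
    return problems


print(validate_search_params({"query": "quantum computing", "max_num_results": 10}))
# prints []
print(validate_search_params({"query": "", "max_num_results": 50}))
```

Pass `max_results_limit=100` when using an API key with the raised result limit.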

ValyuContentsTool Parameters

  • urls (required): List of URLs to extract content from (max 10 per request)
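Because a single request accepts at most 10 URLs, longer lists need to be split into batches. The following sketch shows one way to do that with the standard library; `batch_urls` is a hypothetical helper, and removing duplicate URLs first is a design choice rather than a requirement of the API.

```python
from typing import Iterator, List

MAX_URLS_PER_REQUEST = 10  # limit stated in the parameter list above


def batch_urls(urls: List[str], batch_size: int = MAX_URLS_PER_REQUEST) -> Iterator[List[str]]:
    """Yield de-duplicated URLs in order, in chunks of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    unique = list(dict.fromkeys(urls))  # drops duplicates while preserving order
    for start in range(0, len(unique), batch_size):
        yield unique[start:start + batch_size]


urls = [f"https://example.com/article-{i}" for i in range(23)]
print([len(batch) for batch in batch_urls(urls)])
# prints [10, 10, 3]
```

Each batch can then be passed to the tool in turn, for example `contents_tool._run(urls=batch)`.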

Additional Resources

  • LangChain Valyu Tool: Official LangChain integration documentation
  • API Reference: Complete Valyu API documentation
  • LangGraph Agents: Build advanced agent workflows
  • Get API Key: Sign up for free $10 credit