Baidu Baike search can be integrated with OpenClaw by implementing a secure, external search API: either call Baidu's official API if you have access to one, or build a responsibly written scraper/proxy if not. Expose that API to an OpenClaw skill that you install and configure via ClawHub. Keep the search service, storage, rate limiting, and credentials outside the agent runtime; configure API keys or OAuth in ClawHub; and have the skill make explicit, authenticated REST calls to your proxy. Add caching, logging, and request validation so the agent remains stateless and reliable.
// Minimal example: Node.js Express search proxy that calls an upstream Baidu endpoint
const express = require('express');
const fetch = require('node-fetch');

const app = express();
app.use(express.json());

const BAIDU_API_URL = process.env.BAIDU_API_URL; // set to the official API endpoint if available
const BAIDU_API_KEY = process.env.BAIDU_API_KEY; // or a token for your upstream

app.get('/api/baike/search', async (req, res) => {
  const q = req.query.q;
  if (!q) return res.status(400).json({ error: 'q parameter required' });
  try {
    // If you have an official Baidu JSON API:
    const url = `${BAIDU_API_URL}?q=${encodeURIComponent(q)}&apikey=${encodeURIComponent(BAIDU_API_KEY)}`;
    const upstream = await fetch(url, { method: 'GET' });
    if (!upstream.ok) {
      const text = await upstream.text();
      return res.status(502).json({ error: 'upstream error', status: upstream.status, body: text });
    }
    const data = await upstream.json();
    // Normalize upstream fields to a stable shape:
    const normalized = (data.items || []).map(item => ({
      title: item.title || item.name,
      snippet: item.snippet || item.summary,
      url: item.url || item.link
    }));
    return res.json({ query: q, results: normalized });
  } catch (err) {
    console.error('search error', err);
    return res.status(500).json({ error: 'internal_error' });
  }
});

app.listen(process.env.PORT || 3000, () => {
  console.log('Baike proxy listening');
});
// Example code that would run in a skill runtime: perform a call to your proxy
// This code uses fetch to call your external service. Adapt it to the skill runtime pattern you use.
const fetch = require('node-fetch');

async function searchBaike(query) {
  const base = process.env.BAIKE_PROXY_URL; // set in the ClawHub environment for the skill
  const key = process.env.BAIKE_PROXY_KEY; // stored as a secret in ClawHub
  const url = `${base}/api/baike/search?q=${encodeURIComponent(query)}`;
  const resp = await fetch(url, {
    method: 'GET',
    headers: {
      'Authorization': `Bearer ${key}`,
      'Accept': 'application/json'
    }
  });
  if (!resp.ok) {
    const body = await resp.text();
    throw new Error(`proxy error ${resp.status}: ${body}`);
  }
  const json = await resp.json();
  return json.results;
}

// Example usage inside the skill
(async () => {
  const results = await searchBaike('人工智能'); // "artificial intelligence"
  console.log(results);
})();
Concluding summary: build a secured external Baike search service (official API or careful scraper), expose it as a stable JSON API, keep all credentials and state outside the OpenClaw agent runtime, install/configure the skill via ClawHub with secrets injected, and verify via thorough testing and monitoring.
1. Baidu Baike returns HTTP 302/403 or JavaScript redirects because its front-end and anti-bot systems detect non-browser requests (missing cookies, headers, JS execution, or suspicious IP/rate patterns) and push you to login/captcha pages or a JS challenge. Your OpenClaw Spider (an agent skill making plain HTTP calls) won't automatically run page JS or solve captchas, so it hits redirects or is denied access.
Common causes and practical steps: missing or unrealistic request headers, absent cookies, aggressive request rates, and datacenter IPs all trip the anti-bot checks. Send browser-like User-Agent, Referer, and Accept-Language headers, persist cookies between requests, throttle and randomize request timing, rotate reputable proxies, and route pages that still fail to an external headless renderer.
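A minimal sketch of the browser-like request profile and cookie persistence described above, assuming Node 18+ (global `fetch`); the header values and the `politeFetch` helper are illustrative, not an OpenClaw API:

```javascript
// Illustrative browser-like headers; adjust to match a real browser you test with.
const BROWSER_HEADERS = {
  'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36',
  'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
  'Accept-Language': 'zh-CN,zh;q=0.9,en;q=0.8',
  'Referer': 'https://baike.baidu.com/'
};

// Very small cookie jar: keeps only name=value pairs across requests.
function updateJar(jar, setCookieHeaders) {
  for (const raw of setCookieHeaders || []) {
    const [pair] = raw.split(';'); // drop attributes (Path, Expires, ...)
    const eq = pair.indexOf('=');
    if (eq > 0) jar[pair.slice(0, eq).trim()] = pair.slice(eq + 1).trim();
  }
  return jar;
}

function cookieHeader(jar) {
  return Object.entries(jar).map(([k, v]) => `${k}=${v}`).join('; ');
}

async function politeFetch(url, jar) {
  const resp = await fetch(url, {
    headers: { ...BROWSER_HEADERS, Cookie: cookieHeader(jar) },
    redirect: 'manual' // surface 302s instead of silently following them
  });
  // getSetCookie is available in Node 18.14+; fall back to nothing otherwise.
  updateJar(jar, resp.headers.getSetCookie ? resp.headers.getSetCookie() : []);
  return resp;
}
```

Persisting the jar (e.g. in Redis, keyed per proxy IP) lets subsequent requests reuse session cookies instead of re-triggering the challenge.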
2. Always treat the HTTP body as raw bytes: detect the encoding from the Content-Type header, the HTML meta tag, or a charset detector, then decode the bytes with the correct codec (mapping GB2312 to GBK). In an OpenClaw Spider/Parser, read the response bytes, pick the encoding, decode to a UTF-8 string, then pass that string to your parser.
const axios = require('axios');
const iconv = require('iconv-lite');
const jschardet = require('jschardet');
const cheerio = require('cheerio');

async function fetchAndParse(url) {
  const res = await axios.get(url, { responseType: 'arraybuffer' });
  const buf = Buffer.from(res.data);
  // prefer the charset declared in the Content-Type header
  const headerCharset = (res.headers['content-type'] || '').match(/charset=([^;]+)/i)?.[1];
  let enc = headerCharset || jschardet.detect(buf).encoding || 'utf-8';
  enc = enc.toLowerCase().replace('gb2312', 'gbk'); // GB2312 is a subset of GBK; GBK decodes both safely
  const text = iconv.decode(buf, enc);
  const $ = cheerio.load(text);
  // now extract content reliably
  return $('body').text();
}
3. Use the OpenClaw Downloader Middleware to detect Baidu Baike responses that require JS or show captchas, then route those requests to an external headless-rendering service (outside the agent runtime) and to a captcha-resolution workflow; combine proxy rotation, proper headers/cookies, and environment-stored credentials so skills remain authorized.
Use middleware to inspect HTTP responses; when you see JS placeholders or a captcha challenge, forward the URL to an external renderer (Puppeteer/Playwright service or headless-render API) that returns fully rendered HTML and cookies. Use rotating residential proxies, realistic User-Agent and Referer headers, and persist cookies back into the downloader for future requests.
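The inspect-and-forward logic can be sketched as follows; `RENDERER_URL` and the `{ url } → { html, cookies }` request/response shape are assumptions about your external rendering service, and the detection heuristics are illustrative starting points, not a fixed rule:

```javascript
// Heuristics for "this response needs a real browser" — tune against real pages.
function needsRendering(status, html) {
  if (status === 302 || status === 403) return true; // redirect/forbidden from anti-bot
  if (/verify|captcha|wappass\.baidu\.com/i.test(html)) return true; // captcha/login targets
  if (!/<\/body>/i.test(html) && /<script/i.test(html)) return true; // script-only shell, no rendered body
  return false;
}

async function fetchWithFallback(url) {
  const direct = await fetch(url, { redirect: 'manual' });
  const html = await direct.text();
  if (!needsRendering(direct.status, html)) {
    return { html, rendered: false };
  }
  // Hand off to the external headless-rendering service
  // (a Puppeteer/Playwright instance behind a small HTTP API).
  const r = await fetch(process.env.RENDERER_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ url })
  });
  const { html: renderedHtml } = await r.json();
  return { html: renderedHtml, rendered: true };
}
```

Keeping the renderer external means the skill itself stays a stateless HTTP client; only the rendering service needs a browser runtime and captcha workflow.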
4. Use the OpenClaw Scheduler to enqueue page requests and stop when results end; compute a stable request fingerprint (URL + sorted params) and store it in a durable set (Redis, a DB) to deduplicate before fetching; run an Item Pipeline step that validates and filters items (fields, language, length, duplicates) and emits only cleaned items. Keep auth and rate limits in the runtime config and move state to external storage for reliability.
import hashlib, json, redis, requests

r = redis.Redis()

def fingerprint(url, params):
    key = url + json.dumps(params, sort_keys=True)
    return hashlib.sha1(key.encode()).hexdigest()

def fetch_page(url, params):
    fp = fingerprint(url, params)
    if r.sadd("fingerprints", fp) == 0:
        return []  # already fetched
    resp = requests.get(url, params=params)
    return resp.json().get("items", [])

def item_pipeline(item):
    if not item.get("title") or len(item.get("summary", "")) < 30:
        return None  # drop
    return {"title": item["title"].strip(), "summary": item.get("summary", "")}