Building a Domain Classification App with Coze
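About the App
Name: Website Domain Classification App.
Description: The user enters a domain; the system analyzes the website's content in real time and returns three categories: one primary category and two secondary categories.
How It Works
Provide the full list of IAB categories to the LLM.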
https://docs.webshrinker.com/v3/iab-website-categories.html#tier-1-and-tier-2-categories
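Crawl the content of the website's home page for the given domain.
Pass the content to the LLM for classification; the LLM returns the primary and secondary categories.
Design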
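- An area where the user can enter a domain.
- A button that submits the classification request.
- Workflow
  - Domain validation
  - Content crawling and cleaning
  - LLM classification
  - Result formatting
- An area that displays the classification result.
- An area that displays a brief summary of the website.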
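graph TD
A[User enters domain] --> B{Domain validity check}
B -->|Valid| C[Web crawling module]
B -->|Invalid| E[Return error message]
C --> F[Content cleaning module]
F --> G[LLM classification module]
G --> H[Result formatting]
H --> I[Output classification result]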
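The content-cleaning step in the flow above can be illustrated with a small sketch. It assumes the crawling step hands over raw HTML as a string. The function name `clean_html`, the `max_chars` limit, and the returned `title`/`content` keys are hypothetical choices for illustration, not the behavior of any particular Coze plugin.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the <title> text and the visible text, skipping script/style."""

    SKIP_TAGS = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.title_parts = []
        self.text_parts = []
        self._skip_depth = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._skip_depth:
            return  # ignore JavaScript and CSS bodies
        if self._in_title:
            self.title_parts.append(data)
        else:
            self.text_parts.append(data)


def clean_html(html, max_chars=4000):
    """Return the page title and whitespace-normalized text, truncated.

    max_chars is an assumed limit that keeps the LLM input reasonably short.
    """
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    title = " ".join("".join(parser.title_parts).split())
    content = " ".join(" ".join(parser.text_parts).split())
    return {"title": title, "content": content[:max_chars]}
```

Truncating the cleaned text is a deliberate simplification; a model with a long context window may not need it.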
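Business Logic
The business logic is the orchestration of the workflow that carries the code logic. It can be thought of as a black box in which the software processes the user's input and produces the target data.
In traditional development, business logic is expressed in code; here it is expressed as a visual, low-code workflow. The essence has not changed; it is simply a higher level of abstraction. Looking at the history of computing and programming, each new stage has raised programming by one level of abstraction, and visual low-code is the newest level. Developers need to know fewer and fewer low-level details, so new developers have less underlying knowledge to worry about. The barrier to entry keeps dropping and the pool of developers keeps widening, to the point where anyone can be a developer.
After some research and debugging, we arrived at the following preliminary list of workflow nodes: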
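- Start node.
  - Purpose: receives the domain input and stores it in the domain variable.
  - Analogy: the request object of an API.
- Domain check node.
  - Purpose: checks whether the input matches a valid domain format and returns the domain together with its HTTP scheme. Input: domain. Output: domain, is_valid, validation_type, scheme. Part of the code was AI-generated, and I added some of the logic myself. This node is a Python code snippet.
  - Analogy: a domain validation method.
- Check result branch node.
  - Purpose: evaluates the result of the previous node and chooses a branch. If the input is valid and the domain information was obtained, the flow goes to the [Accessible URL assembly node]; otherwise it goes to the [End node].
  - Analogy: an if branch.
- Accessible URL assembly node.
  - Purpose: a domain can be entered in many forms: some inputs use http, some use https, some have no scheme, and some even include a port number. The address therefore has to be reassembled from the result of the [Domain check node]. If the input is already an accessible URL, it can be used directly. Whether this node is needed depends on how well the later [Website content fetch node] adapts to different inputs.
  - Analogy: a compatibility method that completes and assembles an accessible URL.
- Website content fetch node.
  - Purpose: fetches the content and key information, such as the title, of the website at the given link/URL. Some fetchers can even extract key information from the header, organized by type.
  - Analogy: a local or third-party link-fetching service.
- Fetch result branch node.
  - Purpose: evaluates the fetch result of the previous node. Based on the previous node's output definition, it determines whether the fetch succeeded. On success the flow goes to the [LLM domain classifier node]; otherwise it goes to the [End node].
  - Analogy: an if branch.
- LLM domain classifier node.
  - Purpose: uses an LLM to classify the website content and return the primary and secondary categories. This is where the AI model comes in. The most important task here is writing the prompt. You can pick one of the models Coze provides; we chose DeepSeek V3, which has been very popular recently. It is best to pick a model that handles long text well.
  - Analogy: a local or third-party classification service.
  Note: for the LLM prompt, see the section "LLM Domain Classifier System Prompt".
- Classification result formatting node.
  - Purpose: tidies up the classification result. At the lowest level a computer processes 0s and 1s; one level up it processes numbers; at the application level it mostly processes strings. We often say this is where code plays a text-processing game. The node ultimately formats three strings (category name, category ID, and classification reason) and defines variables for them.
  - Analogy: a wrap-up method that tidies the result.
- End node.
  - Purpose: defines the final output of the workflow. It can be a variable data structure; text output and streaming output are also supported.
  - Analogy: defining, assigning, and returning the output data structure at the API layer.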
Note: workflow nodes also follow the three control structures of programming: sequence, selection, and iteration.
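As an illustration of the domain check and URL assembly nodes, here is a minimal sketch, not the author's actual node code. The output keys domain, is_valid, validation_type, and scheme come from the node description above; everything else is an assumption: the `validation_type` values ("url", "domain", "invalid"), the extra `port` field, the default scheme of https, and the function names `check_domain` and `build_url`. It handles ASCII domain names only.

```python
import re

# A hostname label: letters, digits, hyphens; no leading or trailing hyphen.
_LABEL = r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)"
# One or more labels followed by an alphabetic top-level domain.
_DOMAIN_RE = re.compile(r"^(?:" + _LABEL + r"\.)+[A-Za-z]{2,63}$")
# Optional http/https scheme, host, optional port, optional path/query/fragment.
_INPUT_RE = re.compile(
    r"^(?:(?P<scheme>https?)://)?"
    r"(?P<host>[^/:?#\s]+)"
    r"(?::(?P<port>\d{1,5}))?"
    r"(?:[/?#].*)?$",
    re.IGNORECASE,
)


def check_domain(raw):
    """Validate user input and return the fields the branch node inspects."""
    result = {
        "domain": "",
        "is_valid": False,
        "validation_type": "invalid",  # assumed values: url / domain / invalid
        "scheme": "",
        "port": "",  # hypothetical extra field, not in the original outputs
    }
    match = _INPUT_RE.match((raw or "").strip())
    if not match:
        return result
    host = match.group("host").lower()
    if len(host) > 253 or not _DOMAIN_RE.match(host):
        return result
    scheme = match.group("scheme")
    result.update(
        domain=host,
        is_valid=True,
        validation_type="url" if scheme else "domain",
        scheme=(scheme or "https").lower(),  # assumption: default to https
        port=match.group("port") or "",
    )
    return result


def build_url(check):
    """Assemble an accessible URL from a successful check result."""
    port = ":" + check["port"] if check["port"] else ""
    return check["scheme"] + "://" + check["domain"] + port
```

For example, `build_url(check_domain("example.com"))` yields `https://example.com`. Inside a Coze code node the same logic would need to be wrapped in the node's own entry function.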
User Interface
We chose an H5 app; a PC app is also an option.
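- Layout
  - Container 1
  - Title
  - Container 2
  - Form input
  - Submit button
  - Container 3
  - Classification result subheading
  - Container 4
  - Category name text
  - Container 5
  - Category ID text
  - Container 6
  - Classification reason text
  - Container 1
- Style
  - Position: relative positioning.
  - Size: width 100%. For a single-line element, use a fixed height; for an element that must adapt to its content, use auto-fit or fill-container. Choose and adjust according to the actual situation.
  - Layout: choose vertical, horizontal, or wrap as needed. For example, the classification result uses a horizontal layout and everything else uses a vertical one.
  - Other: text color, background color, font size, spacing, border radius, margins, and so on are all CSS style properties.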
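The Coze UI builder is sufficient for building simple apps that look clean and attractive. The store apps and templates are a good reference; the officially made apps and templates are quite polished.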
LLM Domain Classifier System Prompt
Preparing the Category Names
https://docs.webshrinker.com/v3/iab-website-categories.html#tier-1-and-tier-2-categories
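Use kimi to collect the top-level categories from the link above:
Collect all IAB categories in the text, excluding subcategories, and concatenate them. Example output format: "IAB1 - Arts & Entertainment","IAB2 - Automotive"
Output: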
"IAB1 - Arts & Entertainment", "IAB2 - Automotive", "IAB3 - Business", "IAB4 - Careers", "IAB5 - Education", "IAB6 - Family & Parenting", "IAB7 - Health & Fitness", "IAB8 - Food & Drink", "IAB9 - Hobbies & Interests", "IAB10 - Home & Garden", "IAB11 - Law, Government, & Politics", "IAB12 - News / Weather / Information", "IAB13 - Personal Finance", "IAB14 - Society", "IAB15 - Science", "IAB16 - Pets", "IAB17 - Sports", "IAB18 - Style & Fashion", "IAB19 - Technology & Computing", "IAB20 - Travel", "IAB21 - Real Estate", "IAB22 - Shopping", "IAB23 - Religion & Spirituality", "IAB24 - Uncategorized", "IAB25 - Non-Standard Content", "IAB26 - Illegal Content"
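LLM Classification Prompt
Obtain the prompt using kimi's long-thinking mode:
I have a list of domain category names. I want the LLM to classify a domain based on this list and on the content of the specified domain's website. How should I write the prompt?
Output (manually adjusted):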
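# Domain Classification Task
## Task Description
You have a list of domain category names and want to classify a domain based on this list and on the content of the specified domain's website. The following explains how to complete this task.
### Input
- Domain category name list: "IAB1 - Arts & Entertainment", "IAB2 - Automotive", "IAB3 - Business", "IAB4 - Careers", "IAB5 - Education", "IAB6 - Family & Parenting", "IAB7 - Health & Fitness", "IAB8 - Food & Drink", "IAB9 - Hobbies & Interests", "IAB10 - Home & Garden", "IAB11 - Law, Government, & Politics", "IAB12 - News / Weather / Information", "IAB13 - Personal Finance", "IAB14 - Society", "IAB15 - Science", "IAB16 - Pets", "IAB17 - Sports", "IAB18 - Style & Fashion", "IAB19 - Technology & Computing", "IAB20 - Travel", "IAB21 - Real Estate", "IAB22 - Shopping", "IAB23 - Religion & Spirituality", "IAB24 - Uncategorized", "IAB25 - Non-Standard Content", "IAB26 - Illegal Content"
- Domain website content: {{content}}
### Instructions
Classify the domain by following these steps:
#### Step 1: Understand the category name list
Read each category name in the list carefully and make sure its meaning and scope are clear.
#### Step 2: Analyze the domain's website content
Analyze the content of the specified domain's website in detail and extract the key information and core topics. For example, when analyzing the content of "example.com", note keywords it mentions such as technology news and product reviews.
#### Step 3: Match a category
Match the domain's website content against the categories in the category name list. Choose the best-fitting category based on the core topics and characteristics of the content. For example, if the website mainly publishes technology news and product reviews, classify it as "IAB19 - Technology & Computing".
#### Step 4: Verify and adjust
During classification you may run into cases that are unclear or need further verification. If so, return to Step 2 or Step 3 to re-analyze and re-match, so that the classification is accurate and reliable.
### Output Format
Output the classification result in the following format:
Category: [category name]
Reason: [Briefly describe the nature of the website's content, then explain the basis for the classification, for example "Based on the keywords and topics in the domain's website content, it belongs to the IAB19 - Technology & Computing category."]
## Example
### Example Input
- Domain category name list: "IAB1 - Arts & Entertainment", "IAB2 - Automotive", "IAB3 - Business", "IAB4 - Careers", "IAB5 - Education", "IAB6 - Family & Parenting", "IAB7 - Health & Fitness", "IAB8 - Food & Drink", "IAB9 - Hobbies & Interests", "IAB10 - Home & Garden", "IAB11 - Law, Government, & Politics", "IAB12 - News / Weather / Information", "IAB13 - Personal Finance", "IAB14 - Society", "IAB15 - Science", "IAB16 - Pets", "IAB17 - Sports", "IAB18 - Style & Fashion", "IAB19 - Technology & Computing", "IAB20 - Travel", "IAB21 - Real Estate", "IAB22 - Shopping", "IAB23 - Religion & Spirituality", "IAB24 - Uncategorized", "IAB25 - Non-Standard Content", "IAB26 - Illegal Content"
- Domain website content: the content of example.com is "This is a technology news website that focuses on the latest technology news and product reviews."
### Example Output
Category: IAB19 - Technology & Computing
Reason: Based on the keywords and topics in the domain's website content, it belongs to the IAB19 - Technology & Computing category.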
Done!
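To try a prompt like this outside Coze, the `{{content}}` placeholder has to be filled in by hand. The following is a rough sketch of that substitution; the function name `render_prompt` and the `max_chars` cap on each value are assumptions made for illustration, and inside Coze the LLM node performs the variable substitution itself.

```python
import re

# Matches placeholders such as {{content}} or {{ title }}.
_PLACEHOLDER_RE = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_prompt(template, variables, max_chars=6000):
    """Replace {{name}} placeholders with values from `variables`.

    Each value is cut to max_chars characters (an assumed limit) so that a
    very long page does not overflow the model's context window.
    Unknown placeholders are left untouched.
    """

    def substitute(match):
        name = match.group(1)
        if name not in variables:
            return match.group(0)
        return str(variables[name])[:max_chars]

    return _PLACEHOLDER_RE.sub(substitute, template)
```

For example, `render_prompt(prompt_text, {"content": page_text})` returns the prompt with the page text inserted, where `prompt_text` and `page_text` are whatever strings you supply.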
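LLM Classification Prompt, JSON Version
The version above is pure natural language, but natural language currently seems ill-suited to expressing constraints when inputs and outputs must be strict. Combining natural language with a JSON structure helps: AI and machines may understand structured data more easily, and structure also makes the prompt easier for people to read and edit. Inspired by other users online, we also tried expressing the prompt as JSON:
"request":"You will classify a domain based on a list of domain category names, category_list, in which the category names are separated by commas, and on the title and the content of the specified domain's website.",
"restriction":"Output in JSON, following the structure given in format, without a markdown code block (i.e. ```), and with whitespace removed, including newlines and tabs. Each category must specify a category name, name, chosen from the given list of domain category names. Split the string value of name on the separator '-': the first part is the category id and the second part is the category label; trim leading and trailing whitespace from both. Give each category a score on a 10-point scale: the higher the score, the higher the relevance; the lower the score, the lower the relevance. Give a classification reason, reason, that explains in detail why the domain belongs to this category, with reference to the specific content of the domain's website; write the reason in Chinese. Each domain may have multiple related categories, at most 3. List the most relevant category first as the primary category, and keep only categories with a score greater than 5.",
"steps":{
"step1":"Understand the category name list. Read each category name in the list carefully and make sure its meaning and scope are clear.",
"step2":"Analyze the domain's website content. Analyze the content of the specified domain's website in detail and extract the key information and core topics.",
"step3":"Match domain categories. Match the domain's website content against the categories in the category name list, and return the relevant categories based on the core topics and characteristics of the content.",
"step4":"Verify and adjust. During classification you may run into cases that are unclear or need further verification. If so, return to step2 or step3 to re-analyze and re-match, so that the classification is accurate and reliable."
},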
"category_list":"IAB1 - Arts & Entertainment,IAB1-1 - Books & Literature,IAB1-2 - Celebrity Fan/Gossip,IAB1-3 - Fine Art,IAB1-4 - Humor,IAB1-5 - Movies,IAB1-6 - Music & Audio 1,IAB1-7 - Television & Video 1,IAB2 - Automotive,IAB2-1 - Auto Parts,IAB2-2 - Auto Repair,IAB2-3 - Buying/Selling Cars,IAB2-4 - Car Culture,IAB2-5 - Certified Pre-Owned,IAB2-6 - Convertible,IAB2-7 - Coupe,IAB2-8 - Crossover,IAB2-9 - Diesel,IAB2-10 - Electric Vehicle,IAB2-11 - Hatchback,IAB2-12 - Hybrid,IAB2-13 - Luxury,IAB2-14 - Minivan,IAB2-15 - Motorcycles,IAB2-16 - Off-Road Vehicles,IAB2-17 - Performance Vehicles,IAB2-18 - Pickup,IAB2-19 - Road-Side Assistance,IAB2-20 - Sedan,IAB2-21 - Trucks & Accessories,IAB2-22 - Vintage Cars,IAB2-23 - Wagon,IAB3 - Business,IAB3-1 - Advertising,IAB3-2 - Agriculture,IAB3-3 - Biotech/Biomedical,IAB3-4 - Business Software,IAB3-5 - Construction,IAB3-6 - Forestry,IAB3-7 - Government,IAB3-8 - Green Solutions,IAB3-9 - Human Resources,IAB3-10 - Logistics,IAB3-11 - Marketing,IAB3-12 - Metals,IAB4 - Careers,IAB4-1 - Career Planning,IAB4-2 - College,IAB4-3 - Financial Aid,IAB4-4 - Job Fairs,IAB4-5 - Job Search,IAB4-6 - Resume Writing/Advice,IAB4-7 - Nursing,IAB4-8 - Scholarships,IAB4-9 - Telecommuting,IAB4-10 - Military,IAB4-11 - Career Advice,IAB5 - Education,IAB5-1 - 7-12 Education,IAB5-2 - Adult Education,IAB5-3 - Art History,IAB5-4 - College Administration,IAB5-5 - College Life,IAB5-6 - Distance Learning,IAB5-7 - English as a 2nd Language,IAB5-8 - Language Learning,IAB5-9 - College / Graduate School 1,IAB5-10 - Homeschooling,IAB5-11 - Homework/Study Tips,IAB5-12 - K-6 Education 1,IAB5-13 - Private School,IAB5-14 - Special Education,IAB5-15 - Studying Business,IAB6 - Family & Parenting,IAB6-1 - Adoption,IAB6-2 - Babies & Toddlers,IAB6-3 - Daycare/Pre School,IAB6-4 - Family Internet,IAB6-5 - Parenting - K-6 Kids,IAB6-6 - Parenting teens,IAB6-7 - Pregnancy,IAB6-8 - Special Needs Kids,IAB6-9 - Eldercare,IAB7 - Health & Fitness,IAB7-1 - Exercise / Weight Loss 
2,IAB7-2 - ADD,IAB7-3 - AIDS / HIV,IAB7-4 - Allergies,IAB7-5 - Alternative Medicine / Holistic Healing 2,IAB7-6 - Arthritis,IAB7-7 - Asthma,IAB7-8 - Autism / PDD / Asperger’s Syndrome 1,IAB7-9 - Bipolar Disorder,IAB7-10 - Brain Tumor,IAB7-11 - Cancer,IAB7-12 - Cholesterol,IAB7-13 - Chronic Fatigue Syndrome,IAB7-14 - Chronic Pain,IAB7-15 - Cold & Flu,IAB7-16 - Deafness,IAB7-17 - Dental Care,IAB7-18 - Depression,IAB7-19 - Dermatology,IAB7-20 - Diabetes,IAB7-21 - Epilepsy / Seizures 1,IAB7-22 - GERD / Acid Reflux,IAB7-23 - Headaches / Migraines / Fevers / Pain 1,IAB7-24 - Heart Disease / Stroke 1,IAB7-25 - Herbs for Health / Supplements / Vitamins 1,IAB7-27 - IBS / Crohn’s / Celiac Disease 1,IAB7-28 - Incest / Abuse Support,IAB7-29 - Incontinence,IAB7-30 - Infertility,IAB7-31 - Men’s Health,IAB7-32 - Nutrition,IAB7-33 - Orthopedics,IAB7-34 - Panic / Anxiety Disorders,IAB7-35 - Pediatrics,IAB7-36 - Physical Therapy,IAB7-37 - Psychology / Psychiatry / Therapy 1,IAB7-38 - Senior Health,IAB7-39 - Sexuality,IAB7-40 - Sleep Disorders,IAB7-41 - Smoking Cessation,IAB7-42 - Substance Abuse,IAB7-43 - Thyroid Disease / Endocrinology 1,IAB7-45 - Women’s Health,IAB7-WS1 - Abortion 4,IAB8 - Food & Drink,IAB8-1 - American Cuisine,IAB8-2 - Barbecues & Grilling,IAB8-3 - Cajun / Creole,IAB8-4 - Chinese Cuisine,IAB8-5 - Cocktails / Beer,IAB8-6 - Coffee / Tea,IAB8-7 - Cuisine-Specific,IAB8-8 - Desserts & Baking,IAB8-9 - Dining Out,IAB8-10 - Food Allergies,IAB8-11 - French Cuisine,IAB8-12 - Health / Low-Fat Cooking,IAB8-13 - Italian Cuisine,IAB8-14 - Japanese Cuisine,IAB8-15 - Mexican Cuisine,IAB8-16 - Vegan,IAB8-17 - Vegetarian,IAB8-18 - Wine,IAB9 - Hobbies & Interests,IAB9-1 - Art / Technology,IAB9-2 - Arts & Crafts,IAB9-3 - Beadwork,IAB9-4 - Bird-Watching,IAB9-5 - Board Games / Puzzles,IAB9-6 - Candle & Soap Making,IAB9-7 - Card Games,IAB9-8 - Chess,IAB9-9 - Cigars / Vaping / Tobacco & Accessories 1,IAB9-10 - Collecting,IAB9-11 - Comic Books / Anime / Manga 1,IAB9-12 - Drawing / 
Sketching,IAB9-13 - Freelance Writing / Getting Published 2,IAB9-14 - Genealogy,IAB9-16 - Guitar / Keyboard / Drums 1,IAB9-18 - Investors / Inventors / Patents / Copyright 1,IAB9-19 - Jewelry Making,IAB9-20 - Magic & Illusion,IAB9-21 - Needlework,IAB9-22 - Painting,IAB9-23 - Photography,IAB9-24 - Radio,IAB9-25 - Roleplaying Games,IAB9-26 - Sci-Fi & Fantasy,IAB9-27 - Scrapbooking,IAB9-28 - Screenwriting,IAB9-29 - Stamps & Coins,IAB9-30 - Video & Computer Games,IAB9-31 - Woodworking,IAB9-WS1 - Gambling 4,IAB9-WS2 - Weapons 4,IAB10 - Home & Garden,IAB10-1 - Appliances,IAB10-2 - Entertaining,IAB10-3 - Environmental Safety,IAB10-4 - Gardening,IAB10-5 - Home Repair,IAB10-6 - Home Theater,IAB10-7 - Interior Decorating,IAB10-8 - Landscaping,IAB10-9 - Remodeling & Construction,IAB11 - Law, Government, & Politics,IAB11-1 - Immigration,IAB11-2 - Legal Issues,IAB11-3 - Government Resources 1,IAB11-4 - Politics,IAB11-5 - Commentary,IAB12 - News / Weather / Information 3,IAB13 - Personal Finance,IAB13-1 - Beginning Investing,IAB13-2 - Credit / Debit & Loans,IAB13-3 - Financial News,IAB13-4 - Financial Planning,IAB13-5 - Hedge Fund,IAB13-6 - Insurance,IAB13-7 - Investing,IAB13-8 - Mutual Funds / ETFs 1,IAB13-9 - Options,IAB13-10 - Retirement Planning,IAB13-11 - Stocks,IAB13-12 - Tax Planning / Accounting 1,IAB14 - Society,IAB14-1 - Dating / Personals 1,IAB14-2 - Divorce Support,IAB14-3 - LGBTQ+ 1,IAB14-4 - Marriage,IAB14-5 - Senior Living,IAB14-6 - Teens,IAB14-7 - Weddings,IAB14-8 - Ethnic Specific,IAB14-WS1 - Social Networking 4,IAB15 - Science,IAB15-1 - Astrology,IAB15-2 - Biology,IAB15-3 - Chemistry,IAB15-4 - Geology,IAB15-5 - Paranormal Phenomena,IAB15-6 - Physics,IAB15-7 - Space / Astronomy,IAB15-8 - Geography,IAB15-9 - Botany,IAB15-10 - Weather,IAB16 - Pets,IAB16-1 - Aquariums,IAB16-2 - Birds,IAB16-3 - Cats,IAB16-4 - Dogs,IAB16-5 - Large Animals,IAB16-6 - Reptiles,IAB16-7 - Veterinary Medicine,IAB17 - Sports,IAB17-1 - Auto Racing,IAB17-2 - Baseball / Softball 1,IAB17-3 - 
Bicycling,IAB17-4 - Bodybuilding,IAB17-5 - Boxing,IAB17-6 - Canoeing / Kayaking,IAB17-7 - Cheerleading,IAB17-8 - Climbing,IAB17-9 - Cricket,IAB17-10 - Figure Skating,IAB17-11 - Fly Fishing,IAB17-12 - American Football 1,IAB17-13 - Freshwater Fishing,IAB17-14 - Game & Fish,IAB17-15 - Golf,IAB17-16 - Horse Racing,IAB17-17 - Horses,IAB17-18 - Hunting / Shooting,IAB17-19 - Inline Skating,IAB17-20 - Martial Arts,IAB17-21 - Mountain Biking,IAB17-22 - NASCAR Racing,IAB17-23 - Olympics,IAB17-24 - Paintball,IAB17-25 - Power & Motorcycles,IAB17-26 - Basketball,IAB17-27 - Ice Hockey,IAB17-28 - Rodeo,IAB17-29 - Rugby,IAB17-30 - Running / Jogging,IAB17-31 - Sailing / Boating 1,IAB17-32 - Saltwater Fishing,IAB17-33 - Scuba Diving,IAB17-34 - Skateboarding,IAB17-35 - Skiing,IAB17-36 - Snowboarding,IAB17-37 - Surfing / Body-Boarding,IAB17-38 - Swimming,IAB17-39 - Table Tennis / Ping-Pong,IAB17-40 - Tennis,IAB17-41 - Volleyball,IAB17-43 - Waterski / Wakeboard,IAB17-44 - World Football / Soccer 1,IAB18 - Style & Fashion,IAB18-1 - Beauty,IAB18-2 - Body Art,IAB18-3 - Fashion,IAB18-4 - Jewelry,IAB18-5 - Clothing,IAB18-6 - Accessories,IAB19 - Technology & Computing,IAB19-1 - 3-D Graphics,IAB19-2 - Animation,IAB19-3 - Antivirus Software,IAB19-4 - C / C++,IAB19-5 - Cameras & Camcorders,IAB19-6 - Cell Phones,IAB19-7 - Computer Certification,IAB19-8 - Computer Networking,IAB19-9 - Computer Peripherals,IAB19-10 - Computer Reviews,IAB19-11 - Data Centers,IAB19-12 - Databases,IAB19-13 - Desktop Publishing,IAB19-14 - Desktop Video,IAB19-15 - Email / Chat / Messaging 2,IAB19-16 - Graphics Software,IAB19-17 - Home Video / DVD,IAB19-18 - Internet Technology,IAB19-19 - Java,IAB19-20 - JavaScript,IAB19-21 - Mac Support,IAB19-22 - MP3 / MIDI,IAB19-23 - Net Conferencing,IAB19-24 - Net for Beginners,IAB19-25 - Network Security,IAB19-26 - Palmtops / PDAs / Tablets 1,IAB19-27 - PC Support,IAB19-28 - Portable,IAB19-29 - Entertainment / Virtual & Augmented Reality 1,IAB19-30 - Shareware / Freeware / Open 
Source 1,IAB19-31 - Unix / Linux 1,IAB19-32 - Visual Basic,IAB19-33 - Web Clip Art,IAB19-34 - Web Design / HTML,IAB19-35 - Web Search,IAB19-36 - Windows,IAB19-WS1 - Hacking / Cracking 4,IAB19-WS2 - VPNs / Proxies & Filter Avoidance 4,IAB19-WS3 - Language Translation 4,IAB19-WS4 - File Sharing 4,IAB20 - Travel,IAB20-1 - Adventure Travel,IAB20-2 - Africa,IAB20-3 - Air Travel,IAB20-4 - Australia & New Zealand,IAB20-5 - Bed & Breakfast,IAB20-6 - Budget Travel,IAB20-7 - Business Travel,IAB20-8 - By US Locale,IAB20-9 - Camping,IAB20-10 - Canada,IAB20-11 - Caribbean,IAB20-12 - Cruises,IAB20-13 - Eastern Europe,IAB20-14 - Europe,IAB20-15 - France,IAB20-16 - Greece,IAB20-17 - Honeymoons / Getaways,IAB20-18 - Hotels,IAB20-19 - Italy,IAB20-20 - Japan / China 2,IAB20-21 - Mexico & Central America,IAB20-22 - National Parks,IAB20-23 - South America,IAB20-24 - Spas,IAB20-25 - Theme Parks,IAB20-26 - Traveling with Kids,IAB20-27 - United Kingdom,IAB21 - Real Estate,IAB21-1 - Apartments,IAB21-2 - Architects,IAB21-3 - Buying / Selling Homes,IAB22 - Shopping,IAB22-1 - Contests & Freebies,IAB22-2 - Couponing,IAB23 - Religion & Spirituality,IAB23-1 - Alternative Religions,IAB23-2 - Atheism / Agnosticism,IAB23-3 - Buddhism,IAB23-4 - Catholicism,IAB23-5 - Christianity,IAB23-6 - Hinduism,IAB23-7 - Islam,IAB23-8 - Judaism,IAB23-9 - Latter-Day Saints,IAB23-10 - Pagan / Wiccan,IAB24 - Uncategorized,IAB25 - Non-Standard Content,IAB25-1 - Unmoderated UGC / Message Boards 1,IAB25-2 - Extreme Graphic / Explicit Violence,IAB25-3 - Adult Content 1,IAB25-4 - Profane Content,IAB25-5 - Hate Content,IAB25-6 - Under Construction,IAB25-7 - Incentivized,IAB25-WS1 - Content Server 4,IAB25-WS2 - Streaming Media 4,IAB25-WS3 - Trackers 4,IAB25-WS4 - Cryptomining / Cryptojacking 4,IAB26 - Illegal Content,IAB26-1 - Illegal Content,IAB26-2 - Warez,IAB26-3 - Spyware / Malware / Malicious 1,IAB26-4 - Copyright Infringement,IAB26-WS1 - Illegal Drugs & Para",
"title":"{{title}}",
"content":"{{content}}",
"format":{
"result":[
{
"name":"[category name 1]",
"id":"[category ID 1]",
"label":"[category label 1]",
"score":"[score 1]",
"reason":"[reason 1]"
},
{
"name":"[category name 2]",
"id":"[category ID 2]",
"label":"[category label 2]",
"score":"[score 2]",
"reason":"[reason 2]"
}
]
},
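The output is shown below. The results are good, and for now this format is better suited to engineers writing prompts. For non-engineers, however, JSON may add an extra layer to understand. For LLM node prompts in workflows used by non-engineers, more natural language is more appropriate, at least in markdown format, so that the prompt can be built up layer by layer.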
{
"result": [
{
"name": "IAB5 - Education",
"id": "IAB5",
"label": "Education",
"score": "8",
"reason": "网站内容涉及教育理念、教育方法和习惯培养,特别是针对小学生的习惯培养清单,与教育分类高度相关。"
},
{
"name": "IAB19 - Technology & Computing",
"id": "IAB19",
"label": "Technology & Computing",
"score": "7",
"reason": "网站内容包含计算机科学纪录片、编程教育、VSCode 配置等技术相关内容,与科技与计算分类相关。"
},
{
"name": "IAB6 - Family & Parenting",
"id": "IAB6",
"label": "Family & Parenting",
"score": "6",
"reason": "网站内容涉及孩子性格培养、习惯养成等家庭教育相关内容,与家庭与育儿分类相关。"
}
]
}
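A result-formatting step could consume output like the above roughly as follows. This is a sketch under stated assumptions rather than the actual node code: the function names `split_name` and `organize`, the output keys, and the newline used to join entries are all hypothetical. It splits each name on the first " - " (with spaces) rather than on a bare '-', because sub-category IDs such as IAB1-1 contain a hyphen themselves, and it re-applies the prompt's rules of a score above 5 and at most 3 categories.

```python
import json


def split_name(name):
    """Split 'IAB1-1 - Books & Literature' into ('IAB1-1', 'Books & Literature')."""
    cat_id, _, label = name.partition(" - ")
    return cat_id.strip(), label.strip()


def organize(raw_output, min_score=5, max_items=3):
    """Parse the LLM's JSON output and build the three display strings."""
    items = json.loads(raw_output).get("result", [])
    kept = []
    for item in items:
        try:
            score = float(item.get("score", 0))
        except (TypeError, ValueError):
            continue  # skip entries whose score is not numeric
        if score > min_score:
            kept.append((score, item))
    # Highest score first, so the first entry is the primary category.
    kept.sort(key=lambda pair: pair[0], reverse=True)
    names, ids, reasons = [], [], []
    for _, item in kept[:max_items]:
        name = item.get("name", "")
        cat_id, _label = split_name(name)
        names.append(name)
        ids.append(cat_id)
        reasons.append(item.get("reason", ""))
    return {
        "category_names": "\n".join(names),
        "category_ids": "\n".join(ids),
        "category_reasons": "\n".join(reasons),
    }
```

Recomputing the ID from the name, instead of trusting the model's own id field, is a defensive choice; either approach would work when the model follows the prompt.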
Trial Run of the Business Logic
Packaging and Publishing the App to the Store
TODO
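- Improve the domain check: extract the host from a URL.
- Improve exception and error handling and the messages shown.
- Multiple categories.
- A confidence level, i.e. a score, for each category.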
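For the first item, one possible approach is to let Python's standard `urllib.parse.urlsplit` pull the host out of whatever the user typed. The sketch below is only an illustration of that idea; the function name `extract_host` is hypothetical, and inputs without a scheme are prefixed with `//` so that `urlsplit` treats the leading part as the network location.

```python
from urllib.parse import urlsplit


def extract_host(raw):
    """Return the lowercase host from a URL or bare domain, or '' on failure."""
    text = (raw or "").strip()
    if not text:
        return ""
    if "://" not in text:
        # Without a scheme, urlsplit would treat the host as part of the path.
        text = "//" + text
    try:
        host = urlsplit(text).hostname
    except ValueError:
        return ""  # raised for malformed input such as a broken IPv6 literal
    return (host or "").rstrip(".")
```

The returned host still needs a format check before it is used, since `urlsplit` does not verify that the host is a well-formed domain name.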
9ong@TsingChan 2025 markdown