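World Trade Organization, "Beyond Six Digits: Using Natural Language Processing to Automate Tariff Code (HS Code) Transposition" - Global Innovation and Governance Lab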


Published: 2025-04-21


Author: World Trade Organization

This paper explores the application of Natural Language Processing (NLP) techniques to automate Harmonized System (HS) tariff line transposition. It employs a three-stage process: unique 1:1 tariff code matching (Round 1), exact description matching (Round 2), and “smart” description matching (Round 3), which uses Artificial Intelligence (AI) and lexical similarity methods paired with the harmonized 6-digit concordance and cosine similarity. Similarity is calculated using either Term Frequency-Inverse Document Frequency (TF-IDF) vectors or Sentence-BERT (SBERT) embeddings. The paper compares two scenarios: a straightforward case (Economy A) with standardized descriptions, and a complex case (Economy B) with more detailed technical descriptions. Results indicate that automated HS transposition can significantly improve on the efficiency of traditionally manual methods, reducing processing time from two to three weeks to approximately half a day (up to 30 times faster). For a standard set of approximately 10,000 HS codes, the overall accuracy rate is 99.6% in the simpler scenario and 98.8% in the complex one. While non-AI techniques account for most of the accurate matches, the AI-based Round 3 techniques address the cases that require the most manual effort. SBERT generally outperforms TF-IDF; however, including subheadings tends to reduce its accuracy. In certain cases, particularly for highly technical tariffs, TF-IDF's straightforward approach provides an advantage over SBERT. Overall, NLP techniques hold significant potential for improving HS transposition methods and for facilitating the development of richer tariff and trade datasets that enable more in-depth analyses. Future research should focus on refining these techniques across diverse datasets to optimize their broader application in tariff and trade data analysis.
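The three-round process described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names (`tokenize`, `tfidf_vectors`, `cosine`, `transpose`), the tariff codes and descriptions, and the toy concordance are all hypothetical, and the matching rules are assumptions inferred from the abstract alone. In particular, Round 1 is simplified here to "exactly one candidate line under the concordant 6-digit subheading", and Round 3 uses only TF-IDF with a common smoothed IDF variant (SBERT embeddings would need an external model and are left out).

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a description and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (token -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed IDF (an assumption of this sketch): ln((1 + n) / (1 + df)) + 1
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: f * idf[t] for t, f in Counter(toks).items()} for toks in tokenized]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def transpose(old_lines, new_lines, concordance):
    """Map each old tariff line to (new_code, round_label).

    old_lines / new_lines: dict of national tariff code -> description.
    concordance: dict of old 6-digit code -> set of new 6-digit codes.
    """
    result = {}
    for old_code, old_desc in old_lines.items():
        targets = concordance.get(old_code[:6], set())
        candidates = {c: d for c, d in new_lines.items() if c[:6] in targets}
        if not candidates:
            result[old_code] = (None, "unmatched")
            continue
        # Round 1 (simplified assumption): a single candidate line is taken as a 1:1 match.
        if len(candidates) == 1:
            result[old_code] = (next(iter(candidates)), "round1")
            continue
        # Round 2: exactly one candidate has an identical description.
        key = old_desc.strip().lower()
        exact = [c for c, d in candidates.items() if d.strip().lower() == key]
        if len(exact) == 1:
            result[old_code] = (exact[0], "round2")
            continue
        # Round 3: pick the candidate with the highest TF-IDF cosine similarity.
        codes = list(candidates)
        vecs = tfidf_vectors([old_desc] + [candidates[c] for c in codes])
        scores = [cosine(vecs[0], v) for v in vecs[1:]]
        best = max(range(len(codes)), key=scores.__getitem__)
        result[old_code] = (codes[best], "round3")
    return result


# Hypothetical toy data; codes and descriptions are invented for illustration.
old_lines = {
    "01012100": "Pure-bred breeding horses",  # one candidate line -> Round 1
    "08051010": "Fresh oranges",              # identical description -> Round 2
    "08051090": "Oranges, dried",             # needs similarity -> Round 3
}
new_lines = {
    "01012100": "Pure-bred breeding animals: horses",
    "08051011": "Fresh oranges",
    "08051019": "Dried oranges",
}
concordance = {"010121": {"010121"}, "080510": {"080510"}}

matches = transpose(old_lines, new_lines, concordance)
for old_code, (new_code, round_label) in matches.items():
    print(old_code, "->", new_code, round_label)
```

Restricting the candidates to lines under the concordant 6-digit subheading before any text comparison keeps the similarity step small, which is one plausible reading of why the abstract pairs the 6-digit concordance with cosine similarity. A production version would also need a similarity threshold below which a match is sent to manual review; that threshold is not given in the abstract and is not guessed here.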


https://www.wto.org/english/res_e/reser_e/ersd202504_e.htm
