2019-01-21

Machine Learning


Notes on Alibaba's LTR Talk at QCon

LTR (Learning to Rank) is a machine-learning-based approach to ranking.

SERP: Search Engine Result Page

At a high level, LTR methods fall into three families: pointwise, pairwise, and listwise.

The parameters of the classifier should be tuned to optimize the NDCG score on the cross-validation set.
Query-full: SERPs returned in response to a query.
Query-less: SERPs returned in response to a user click on some product category.

i.i.d.: independent and identically distributed

Personalized E-Commerce Search

Predict relevance scores and re-rank the products returned by an e-commerce search engine on the search engine result page.

Data used

Search, browsing, and transaction histories for all users, and specifically for the user interacting with the search engine in the current session

Product properties and metadata

Methods used

Machine Learning (e.g. RankSVM, LambdaMART)
Ranking Function (e.g. BM25, Cosine Similarity)
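
To make the "ranking function" side concrete, here is a minimal sketch of Okapi BM25 scoring. It is not from the talk: the function name `bm25_scores`, the toy documents, the smoothed IDF variant, and the defaults `k1=1.2` and `b=0.75` are all assumptions chosen for illustration.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against `query_terms` (Okapi BM25)."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            # Smoothed IDF; the "+ 1.0" keeps the value non-negative.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1.0 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / norm
        scores.append(score)
    return scores

# Hypothetical toy corpus of tokenized product titles.
docs = [
    ["red", "summer", "dress"],
    ["blue", "jeans"],
    ["red", "shoes", "red", "laces"],
]
scores = bm25_scores(["red", "dress"], docs)
```

A hand-written function like this has no learned parameters beyond `k1` and `b`, which is the contrast with the machine-learning models listed above; in an LTR system a score like this would typically become one feature among many.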

Theory (PAC)

Generalization
Stability

Applications

Search
Recommender System
Question Answering
Sentiment Analysis: in e-commerce, a sentiment analysis model can be run over user reviews to tell whether users are satisfied with a product

Formulation of LTR

Machine Learning
- Supervised learning with labeled data; labels can come from log analysis, click counts, dwell time after a click, and so on
Ranking of objects by subject
- Feature-based ranking function
Approach
- Traditional: BM25 (probabilistic model)
- New
  - A query and its associated products form a group (training data); the result set of one query is called a group
  - Groups are i.i.d.
  - Features (query and product) within a group are not i.i.d.
  - The model is a function of the features

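Issues

Data Labeling (building the training set)
- Relevance metric (point): similar to CTR prediction, clicked or not clicked
- Ordered pairs
- Ordered list: among all permutations, take the one with the highest probability
Feature Extraction (extremely important)
- Relevance (User/Query-Prod feature): the user's intent (historical or current behavior) matches the attributes of the document
- Semantic (User/Query-Prod feature): semantic relevance, e.g. LDA; deep learning models such as CNNs are now popular
- Importance (Prod feature): the importance of the document itself, such as the key attributes of a SKU, much as PageRank is a feature of a web page's own importance
Learning Method
- Model: model selection has to fit the business; picking the most suitable model requires understanding both the data and the business
- Loss Function: for example cross-entropy. Ranking losses are often non-smooth or hard to differentiate, so in many cases a loss function that is easy to differentiate is chosen instead
- Optimization Algorithms: e.g. stochastic gradient descent
Evaluation Measure
- NDCG: the ratio of the DCG of the current ranking to the DCG of the ideal (expected) ranking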
Machine Learning
- Classification
- Regression
- Ordinal Classification/Regression
Ordinal Regression
- Pointwise
  - Transform ranking into a regression problem
  - Ignore group info
Learning to Rank
- Pairwise
  - Transform ranking into a binary classification problem
- Listwise
  - Learn the ranking directly
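
Since NDCG is the evaluation measure used throughout these notes, a small sketch may help. It assumes the common exponential-gain form, gain `2^rel - 1` with a `log2(position + 1)` discount; the talk does not say which variant was used, and the function names `dcg` and `ndcg` are illustrative.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain of relevance labels listed in ranked order."""
    # enumerate() starts at 0, so position p (1-based) is discounted by log2(p + 1).
    return sum((2 ** rel - 1) / math.log2(pos + 2)
               for pos, rel in enumerate(relevances))

def ndcg(relevances, k=None):
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    ranked = relevances[:k]
    ideal = sorted(relevances, reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    # A group with no relevant item has no meaningful NDCG; return 0.0 by convention.
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0

# Labels of the items in the order the model ranked them (hypothetical data).
perfect = ndcg([3, 2, 1, 0])
reversed_order = ndcg([0, 1, 2, 3])
```

NDCG is computed per group (per query) and then averaged over groups, which matches the grouping of training data described above.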

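Pointwise models (not widely used)

McRank (2007)
Ordinal Linear Regression (1992)

Pairwise models (more popular)

RankSVM (2000)
- Pairwise classification
IR SVM
- Cost-sensitive pairwise: for a ranked result A, B, C, getting A above C right matters a lot, and ranking C above A is a costly mistake
- Uses a modified hinge loss
RankBoost (2003): boosting-based
RankNet (2005): neural network
LambdaMART (2008): the most popular; can be used as either pairwise or listwise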
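
The pairwise idea, turning ranking into binary classification, can be sketched as follows. This is an illustration in the spirit of RankSVM rather than the speaker's pipeline: the helper `make_pairs`, the feature-difference representation, and the toy groups are assumptions.

```python
from itertools import combinations

def make_pairs(groups):
    """Build pairwise training examples from per-query groups.

    groups: dict mapping query id -> list of (feature_vector, relevance_label).
    Returns feature differences X and labels y in {+1, -1}.
    """
    X, y = [], []
    for items in groups.values():
        # Pairs are only formed inside a group: items from different
        # queries are never compared with each other.
        for (fa, la), (fb, lb) in combinations(items, 2):
            if la == lb:
                continue  # ties carry no ordering information
            X.append([a - b for a, b in zip(fa, fb)])
            y.append(1 if la > lb else -1)
    return X, y

# Hypothetical groups: two queries with (features, label) per product.
groups = {
    "q1": [([1.0, 0.2], 2), ([0.5, 0.1], 0), ([0.7, 0.9], 1)],
    "q2": [([0.3, 0.3], 1), ([0.4, 0.8], 1)],
}
pair_features, pair_labels = make_pairs(groups)
```

Any binary classifier can then be trained on `pair_features` and `pair_labels`; a linear model learned this way yields a scoring function whose score differences predict which item of a pair should rank higher.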

Listwise models (goal: maximize the probability of the correct ranking)

Plackett-Luce Model
ListMLE (2008)
ListNet (2007)
- Parameterized Plackett-Luce Model
AdaRank (2007)
PermuRank (2008)
SVM-MAP (2007)
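
The "probability of a ranking" that listwise models maximize can be illustrated with the Plackett-Luce model: items are picked one position at a time, each with probability proportional to `exp(score)` among the items still remaining. The sketch below is a plain-Python illustration under that definition; the function names are hypothetical, and the loss shown is the negative log-likelihood in the style of ListMLE, not code from the talk.

```python
import math

def plackett_luce_log_prob(scores_in_order):
    """Log-probability of a permutation, given model scores listed in ranked order."""
    log_prob = 0.0
    for i in range(len(scores_in_order)):
        tail = scores_in_order[i:]  # items not yet placed
        m = max(tail)               # subtract the max for numerical stability
        log_norm = m + math.log(sum(math.exp(s - m) for s in tail))
        log_prob += scores_in_order[i] - log_norm
    return log_prob

def list_mle_loss(scores, true_order):
    """Negative log-likelihood of the ground-truth order under the model scores.

    scores: model score per item; true_order: item indices from best to worst.
    """
    ordered = [scores[i] for i in true_order]
    return -plackett_luce_log_prob(ordered)

scores = [2.0, 1.0, 0.0]
loss_correct = list_mle_loss(scores, [0, 1, 2])
loss_reversed = list_mle_loss(scores, [2, 1, 0])
```

Scores that agree with the ground-truth order give a lower loss than scores that contradict it, so minimizing this loss pushes the model toward making the correct permutation the most probable one.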

Optimization

Direct Optimization
- AdaRank
- SVM-MAP
Approximation
- SoftRank
- LambdaRank
Learning Framework
- Data Representation
- Expected Risk
- Empirical Risk
- Generalization Analysis
Evaluation (summary)
- The pairwise and listwise approaches perform better than the pointwise approach

Applications

Search
- Re-ranking
Recommender System
- Collaborative Filtering

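PERSONALIZED E-COMMERCE SEARCH

Pertinence: how to judge whether search results are good or bad

Log Analysis: evaluate by analyzing logs
Conversion in E-commerce: evaluate by e-commerce conversion rate, though conversion has many dimensions. Normally purchase > add-to-cart > favorite, but in the run-up to Double 11 (Singles' Day) users tend to add items to the cart or to favorites ahead of time and wait to place the order at midnight on Double 11. In that scenario an add-to-cart or a favorite is worth no less than a purchase.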

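Data sources

User info: basic profile, behavior logs, etc.
List of the terms that form the query
Displayed items and their domains: what was shown to the user; in e-commerce, the product list
Items on which the user clicked
Timing of all of these actions: the timeline of user behavior, for example to analyze dwell time; all user actions are strung together by time
History Behaviors, Day 28 to Day 30: roughly the last month of historical behavior data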

Ensemble Model

Boosting
Bagging
Stacking

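Traps to watch out for

Position Bias
- People naturally favor whatever is ranked first and read from top to bottom. When a user enters a broad term such as "skirt", their intent is unclear (they may not know what they want themselves), so the search engine can only return broad results. A click on some item may not reflect the user's real intent, but at least within that page the clicked product is better than the ones that were not clicked.
- All online logs and metrics are produced under the previous model, which steers user behavior and introduces noise, so the collected data is not a randomly shuffled sample. Where possible, use bucketed experiments.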

Clicks feedback

When to do personalize

Long Term: long-term preferences, e.g. a user who buys toothpaste once every two months
Short Term: short-term preferences, based on the current session

Past interaction timescales

Search behaviors timescales

Learning from all repeated results

Features

Aggregate features
Query features
User click habits
- Number of times the user clicked on the item in the past
Session features
Non-Personalized Rank
- Read linearly
- Computed with information
Inhibiting/Promoting features
- Query click entropy
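
Query click entropy, the last feature above, measures how spread out the clicks for a query are across items. The sketch below uses the usual Shannon entropy over the click distribution; the function name and the sample click logs are made up for illustration, and reading entropy as a signal for when to personalize is an interpretation, not something the notes spell out.

```python
import math
from collections import Counter

def click_entropy(clicked_items):
    """Shannon entropy (in bits) of the click distribution for one query."""
    counts = Counter(clicked_items)
    total = sum(counts.values())
    # An empty click log yields 0 by construction (the sum has no terms).
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical click logs: item ids clicked after each query was issued.
navigational = click_entropy(["item_a", "item_a", "item_a", "item_a"])
ambiguous = click_entropy(["item_a", "item_b", "item_c", "item_d"])
```

A query whose clicks all land on one item has entropy 0, suggesting most users want the same result; a query with clicks spread evenly over four items has entropy 2 bits, suggesting intent varies by user and personalization has more room to help.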

Methodology

Classification will be used
- The parameters of the classifier should be tuned to optimize the NDCG score on the cross-validation set.
Query Full
- SERPs returned in response to a query
Query Less
- SERPs returned in response to the user click

Ensemble Model

A very powerful technique to increase accuracy on a variety of ML tasks

Boosting
Bagging

Ensemble Correlation

Voting
Weighting
Averaging
Rank averaging
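
Rank averaging, the last option above, combines models by averaging the rank each model assigns to an item instead of the raw scores, which sidesteps the problem that different models score on different scales. A minimal sketch, assuming every model scores the same items in the same order; the function name `rank_average` and the three toy models are hypothetical.

```python
def rank_average(model_scores):
    """Average, per item, the rank (1 = best) each model assigns to it.

    model_scores: list of score lists, one per model, aligned by item index.
    """
    n_items = len(model_scores[0])
    rank_sum = [0] * n_items
    for scores in model_scores:
        # Item indices sorted from highest to lowest score for this model.
        order = sorted(range(n_items), key=lambda i: scores[i], reverse=True)
        for rank, item in enumerate(order, start=1):
            rank_sum[item] += rank
    return [total / len(model_scores) for total in rank_sum]

# Three hypothetical models with incompatible score scales.
model_a = [0.9, 0.2, 0.5]
model_b = [30.0, 10.0, 40.0]
model_c = [0.6, 0.1, 0.3]
averaged = rank_average([model_a, model_b, model_c])
# Lower average rank is better.
final_order = sorted(range(len(averaged)), key=lambda i: averaged[i])
```

Here item 0 is ranked first by two of the three models and ends up on top, even though `model_b` uses scores that are orders of magnitude larger than the others.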

Mr.Crowley

Search architect, microservices architect

Beijing, China
