x4 | 易学教程

[NLP] Understanding the XLNet Paper

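Paper: https://arxiv.org/pdf/1906.08237.pdf

Introduction

XLNet is a BERT-like model, but it uses a generalized autoregressive pretraining method (an AR model), whereas BERT, which is based on a denoising autoencoder (DAE), uses denoising autoencoding (an AE model). BERT and AR models differ mainly in three respects:

1. Independence assumption: BERT masks about 15% of the tokens in a sequence, and its pretraining objective is computed under the assumption that the masked tokens are independent of one another and do not interact. An AR model needs no masks, so its pretraining objective requires no independence assumption. The masks also make the pretraining data and the fine-tuning data inconsistent, which is another weakness of BERT.

2. Input noise: BERT randomly masks the input sequence during pretraining, which is a form of input noise. During fine-tuning on downstream tasks no such noise (random masking) is applied, which creates a discrepancy between pretraining and fine-tuning. An AR model, by contrast, adds no input noise and so does not have this problem.

3. Context dependency: An AR model conditions only on forward context or only on backward context, whereas BERT conditions on both directions. As a result, BERT's objective allows the model to be trained to capture bidirectional context better.

Overall idea of XLNet (an AR model)

XLNet proposes a way to let an AR language model (unidirectional input and prediction)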
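The independence assumption in point 1 can be made concrete by writing the two pretraining objectives side by side. This is a sketch in the notation of the XLNet paper, which the excerpt itself does not show: $\hat{\mathbf{x}}$ is the masked input, $\bar{\mathbf{x}}$ the set of masked tokens, and $m_t = 1$ when $x_t$ is masked.

```latex
% AR language modeling: an exact factorization, no independence assumption
\max_\theta \; \log p_\theta(\mathbf{x})
  = \sum_{t=1}^{T} \log p_\theta(x_t \mid \mathbf{x}_{<t})

% BERT (denoising autoencoding): the \approx is the independence assumption,
% since each masked token is reconstructed separately given the masked input
\max_\theta \; \log p_\theta(\bar{\mathbf{x}} \mid \hat{\mathbf{x}})
  \approx \sum_{t=1}^{T} m_t \, \log p_\theta(x_t \mid \hat{\mathbf{x}})
```

The first sum conditions each token only on the tokens before it (point 3), while the second conditions on the whole masked sequence but treats the masked positions as independent of each other.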

Credit Card Scorecard Modeling for Financial Risk Control

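I. Introduction

How can machine learning and big data techniques be used to reduce risk? How is a credit scoring model built? This article addresses these questions with a brief introduction to the risk-control modeling process for credit products in the internet finance industry. The main contents are:

· Definition of credit risk
· Types of credit risk scorecards
· Basic workflow for building a credit scoring model

1. Definition of credit risk

① The concept of risk management

Risk management originated in the United States. It was first advocated in 1930 by the insurance division of the American Management Association and later spread worldwide. With the rapid growth of the internet, emerging technologies such as big data, data mining, and machine learning have made risk management more precise. Lenders collect credit records from the banking system itself together with users' data from the internet, including social relationships, historical spending behavior, and identity attributes, and use big-data "profiling" techniques to build a complete picture of each user, in order to predict the user's ability to honor obligations and to reduce credit risk.

② What is credit risk?

Credit risk, also called default risk, is the possibility that a borrower, securities issuer, or counterparty, for whatever reason, is unwilling or unable to fulfil the terms of a contract and therefore defaults, causing losses to the bank, investor, or counterparty. In other words, it is the possibility that the debtor fails to repay principal and interest, so that the creditor's actual return deviates from its expected return. It is a major type of financial risk. Everything carries risk, but risk control is especially important in finance. Traditional manual credit approval clearly cannot cope with massive volumes of user data, so most internet-finance P2P companies now use machine learning, big data, and similar techniques to assess risk automatically and reduce it as far as possible. Of course, these techniques cannot guarantee zero risk, because many human factors cannot be controlled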

How to save a new sheet in an existing excel file, using Pandas?

Question:

I want to use Excel files to store data processed with Python. My problem is that I can't add sheets to an existing Excel file. Here is sample code that reproduces the issue:

    import pandas as pd
    import numpy as np

    path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"

    x1 = np.random.randn(100, 2)
    df1 = pd.DataFrame(x1)

    x2 = np.random.randn(100, 2)
    df2 = pd.DataFrame(x2)

    # Creating the writer this way starts a new workbook at `path`
    writer = pd.ExcelWriter(path, engine='xlsxwriter')
    df1.to_excel(writer, sheet_name='x1')
    df2.to_excel(writer, sheet_name='x2')
    writer.close()  # writer.save() in pandas versions before 2.0
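One way to add a sheet without losing the existing ones is to open the writer in append mode. The sketch below is not taken from the excerpt: it assumes the openpyxl package is installed (pandas supports append mode with that engine, not with xlsxwriter), and the helper name `append_sheet` and the temporary demo file are made up for illustration.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def append_sheet(path, df, sheet_name):
    """Write df as a new sheet of an existing .xlsx file (hypothetical helper)."""
    # mode="a" opens the existing workbook instead of overwriting it
    with pd.ExcelWriter(path, engine="openpyxl", mode="a") as writer:
        df.to_excel(writer, sheet_name=sheet_name)


# Demo on a throwaway file rather than the path from the question
demo_path = os.path.join(tempfile.mkdtemp(), "demo.xlsx")
with pd.ExcelWriter(demo_path, engine="openpyxl") as writer:
    pd.DataFrame(np.random.randn(5, 2)).to_excel(writer, sheet_name="x1")

append_sheet(demo_path, pd.DataFrame(np.random.randn(5, 2)), "x3")

# Read the sheet names back to confirm that both sheets are present
with pd.ExcelFile(demo_path) as workbook:
    sheet_names = workbook.sheet_names
print(sheet_names)
```

By default, appending a sheet whose name already exists raises an error; recent pandas versions expose an `if_sheet_exists` argument on `pd.ExcelWriter` to change that behavior.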

Regression of variables in a dataframe

Question:

I have a dataframe:

    df = data.frame(x1 = rnorm(50), x2 = rnorm(50), x3 = rnorm(50), x4 = rnorm(50))

I would like to regress each variable against all the other variables, for instance:

    fit1 <- lm(x1 ~ ., data = df)
    fit2 <- lm(x2 ~ ., data = df)

and so on. (Of course, the real dataframe has many more variables.) I tried putting them in a loop, but it didn't work. I also tried lapply but couldn't produce the desired result either. Does anyone know the trick?

Answer 1:

You can use reformulate to build the formulas dynamically:

    df = data.frame(x1 = rnorm

Computing the Overlap Area of Two Rectangles in Python

Compute the overlap area of two rectangles.

    # Read two opposite corners of the first rectangle: x1 y1 x2 y2
    x1, y1, x2, y2 = map(int, input().split())
    # Normalise so that (x1, y1) is the lower-left and (x2, y2) the upper-right corner
    x1, x2 = min(x1, x2), max(x1, x2)
    y1, y2 = min(y1, y2), max(y1, y2)

    # Same for the second rectangle: x3 y3 x4 y4
    x3, y3, x4, y4 = map(int, input().split())
    x3, x4 = min(x3, x4), max(x3, x4)
    y3, y4 = min(y3, y4), max(y3, y4)

    # The rectangles do not overlap if they are separated along either axis
    if (x2 <= x3 or x4 <= x1) or (y2 <= y3 or y4 <= y1):
        print(0)
    else:
        lens = min(x2, x4) - max(x1, x3)  # overlap along x
        wide = min(y2, y4) - max(y1, y3)  # overlap along y
        print(lens * wide)

Source: https://www.cnblogs.com/chenpython123/p/11427275.html
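The same computation can be packaged as a function, which is easier to check against known cases than a script that reads from standard input. This is a minimal sketch, not the source author's code: the name `overlap_area` and the tuple layout are assumptions, and the explicit no-overlap branch is replaced by clamping each overlap length at zero.

```python
def overlap_area(rect_a, rect_b):
    """Area shared by two axis-aligned rectangles.

    Each rectangle is a tuple (x1, y1, x2, y2) holding two opposite
    corners in any order.
    """
    ax1, ay1, ax2, ay2 = rect_a
    bx1, by1, bx2, by2 = rect_b

    # Normalise both rectangles to (left, bottom, right, top)
    a_left, a_right = min(ax1, ax2), max(ax1, ax2)
    a_bottom, a_top = min(ay1, ay2), max(ay1, ay2)
    b_left, b_right = min(bx1, bx2), max(bx1, bx2)
    b_bottom, b_top = min(by1, by2), max(by1, by2)

    # Overlap length along each axis; negative means the rectangles are apart
    width = min(a_right, b_right) - max(a_left, b_left)
    height = min(a_top, b_top) - max(a_bottom, b_bottom)

    # Clamping at zero covers separated and merely touching rectangles
    return max(width, 0) * max(height, 0)


print(overlap_area((0, 0, 2, 2), (1, 1, 3, 3)))  # unit square shared by both
print(overlap_area((0, 0, 1, 1), (2, 0, 3, 1)))  # separated along x only
```

Clamping each axis separately means a gap along one axis yields zero area regardless of what happens along the other axis.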

HDU2056 Rectangles

Problem Description

Given two rectangles and the coordinates of two points on the diagonal of each rectangle, you have to calculate the area of the intersected part of the two rectangles. The sides of each rectangle are parallel to OX and OY.

Input

The first line of input is 8 positive numbers which indicate the coordinates of four points that must be on each diagonal. The 8 numbers are x1,y1,x2,y2,x3,y3,x4,y4. That means the two points on the first rectangle are (x1,y1),(x2,y2); the other two points on the second rectangle are (x3,y3),(x4,y4).

Output

For each case output the area of their intersected

Statistics

mean: arithmetic mean
median: median
mode: mode
prctile: percentiles, used here for quartiles. Quartiles (also called quartile points) are the values at the three cut points obtained when all values are sorted in ascending order and divided into four equal parts.

    >> load stockreturns
    >> x4=stocks(:,4)

    x4 =

        2.9914
        2.6204
       -0.3624
        2.2094
       -2.4079
        2.9277
       -1.0281
       -3.3567
        1.9482
        2.3593
        3.5894
       -0.8645
       -2.1599
       -3.8850
       -0.3494
       -0.8930
        1.7893
       -0.7141
       -1.2013
       -3.3592
        0.6673
        3.7411
        2.1308
        2.6651
        0.9303
       -1.6121
        0.4302
        2.4841
       -1.9898
        4.5561
        2.5903
       -0.1640
        1.3072
       -0.4932
       -4.1415
       -0.1348
        3.2650
       -3.1646
       -0.6340
       -0.7101
       -4.3216
        0.6100
       -1.5280
        0.2114
       -1.4884
        0.9389
        1.1603
       -0.0880
       -2.7591
        3.3299
       -0.2885
        2.2003
       -4.3053
        0.4916
        0.4022
       -0.9178
       -5.8764
       -5.4219
        2.9621
       -3.1092
        2.1127
        0.8175
       -2
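The four statistics named at the start of this entry can also be sketched with Python's standard library. This is an illustration rather than a port of the MATLAB session: it uses only the first eight x4 values printed above, and the list used for the mode is made up, because the mode is not informative for data without repeated values.

```python
import statistics

# First eight values of x4 as printed above (a small subset, for illustration)
sample = [2.9914, 2.6204, -0.3624, 2.2094, -2.4079, 2.9277, -1.0281, -3.3567]

mean_value = statistics.mean(sample)
median_value = statistics.median(sample)

# Quartiles: the three cut points that split the sorted data into four parts.
# method="inclusive" treats the sample minimum and maximum as the 0th and
# 100th percentiles.
q1, q2, q3 = statistics.quantiles(sample, n=4, method="inclusive")

# Mode on a separate, made-up list that contains a repeated value
mode_value = statistics.mode([1, 2, 2, 3])

print(f"mean={mean_value:.4f} median={median_value:.4f}")
print(f"quartiles={q1:.4f}, {q2:.4f}, {q3:.4f}")
print(f"mode={mode_value}")
```

The middle quartile coincides with the median. The outer quartiles depend on the interpolation rule, so a MATLAB prctile call on the same data may give slightly different values.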