Embeddings Actuarial Loss Competition
First 20 components of PCA on Sentence Embeddings
@kaggle.louise2001_embeddings_actuarial_loss_competition
In the actuarial loss competition, we are provided with text data describing accident context and injury type.
This dataset contains the first 20 components of a PCA performed on sentence embeddings of the claim descriptions. It covers both the train and test data.
The embeddings were obtained with a paraphrase DistilRoBERTa model from sentence-transformers: https://github.com/UKPLab/sentence-transformers.
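The reduction step described above can be sketched as follows. This is a minimal illustration, not the author's actual pipeline: the function name `pca_reduce`, the 768-dimensional random stand-in embeddings, the row counts, and the choice to fit the PCA on the stacked train and test rows are all assumptions made for the example. Real embeddings would come from encoding the claim descriptions with sentence-transformers.

```python
import numpy as np


def pca_reduce(embeddings: np.ndarray, n_components: int = 20) -> np.ndarray:
    """Project rows onto their first `n_components` principal components."""
    # Center each feature so the SVD directions are the principal axes.
    centered = embeddings - embeddings.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


# Hypothetical stand-ins for sentence embeddings (assumed shape: rows x 768).
rng = np.random.default_rng(0)
train_emb = rng.normal(size=(300, 768))
test_emb = rng.normal(size=(100, 768))

# Assumption: a single PCA is fitted on train and test stacked together,
# then the result is split back into the two tables.
combined = np.vstack([train_emb, test_emb])
reduced = pca_reduce(combined, n_components=20)
train_20 = reduced[: len(train_emb)]
test_20 = reduced[len(train_emb):]

print(train_20.shape, test_20.shape)
```

Each output column plays the role of one of the `x_0` to `x_19` columns in the tables below. Fitting one PCA on the stacked data keeps the train and test components in the same coordinate system.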
How can these embeddings explain the claim cost and help predict it?
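One simple way to start on this question is to regress the claim cost on the 20 components and look at how much variance they explain. The sketch below is only an illustration under stated assumptions: the claim cost is not part of this dataset, so both the features and the log-cost target are synthetic, and the helper names `fit_linear` and `r_squared` are hypothetical.

```python
import numpy as np


def add_intercept(X: np.ndarray) -> np.ndarray:
    """Prepend a column of ones so the model has an intercept term."""
    return np.column_stack([np.ones(len(X)), X])


def fit_linear(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Ordinary least squares fit; returns [intercept, w_0, ..., w_19]."""
    coef, *_ = np.linalg.lstsq(add_intercept(X), y, rcond=None)
    return coef


def r_squared(X: np.ndarray, y: np.ndarray, coef: np.ndarray) -> float:
    """Share of the variance of y explained by the fitted linear model."""
    residuals = y - add_intercept(X) @ coef
    return float(1.0 - residuals.var() / y.var())


# Synthetic stand-ins: X mimics the x_0..x_19 columns, log_cost mimics a
# log-transformed claim cost (an assumption; real costs come from the competition).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))
true_weights = rng.normal(size=20)
log_cost = 8.0 + 0.3 * (X @ true_weights) + rng.normal(scale=0.5, size=500)

coef = fit_linear(X, log_cost)
r2 = r_squared(X, log_cost, coef)
print(f"in-sample R^2: {r2:.3f}")
```

With real data, the R^2 should be measured on held-out rows rather than in-sample, and the size of each coefficient gives a first hint of which components carry cost-related information.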
CREATE TABLE embeddings_test_20 (
"x_0" DOUBLE,
"x_1" DOUBLE,
"x_2" DOUBLE,
"x_3" DOUBLE,
"x_4" DOUBLE,
"x_5" DOUBLE,
"x_6" DOUBLE,
"x_7" DOUBLE,
"x_8" DOUBLE,
"x_9" DOUBLE,
"x_10" DOUBLE,
"x_11" DOUBLE,
"x_12" DOUBLE,
"x_13" DOUBLE,
"x_14" DOUBLE,
"x_15" DOUBLE,
"x_16" DOUBLE,
"x_17" DOUBLE,
"x_18" DOUBLE,
"x_19" DOUBLE
);
CREATE TABLE embeddings_train_20 (
"x_0" DOUBLE,
"x_1" DOUBLE,
"x_2" DOUBLE,
"x_3" DOUBLE,
"x_4" DOUBLE,
"x_5" DOUBLE,
"x_6" DOUBLE,
"x_7" DOUBLE,
"x_8" DOUBLE,
"x_9" DOUBLE,
"x_10" DOUBLE,
"x_11" DOUBLE,
"x_12" DOUBLE,
"x_13" DOUBLE,
"x_14" DOUBLE,
"x_15" DOUBLE,
"x_16" DOUBLE,
"x_17" DOUBLE,
"x_18" DOUBLE,
"x_19" DOUBLE
);