Extract Sentiment

[1]:

import pandas as pd

# Imported for its side effect: it makes extract_sentiment() available on DataFrames
from pandas_survey_toolkit import nlp

# Create sample survey data with open-ended comments
data = {
    'respondent_id': range(1, 11),
    'comments': [
        "I really enjoyed using this product. It exceeded my expectations.",
        "The customer service was terrible and the product broke after a week.",
        "It was okay, nothing special but did the job I needed it to do.",
        "Absolutely love this! Best purchase I've made all year.",
        "I'm disappointed with the quality compared to what was advertised.",
        "It's fine I guess, but I probably wouldn't buy it again.",
        "Fantastic product and great value for money.",
        "This is rubbish. Complete waste of money and time.",
        "I'm neither happy nor unhappy with this purchase.",
        "Well designed and does exactly what it says on the tin!"
    ]
}

# Create DataFrame
df = pd.DataFrame(data)

# Display the original data
print("Original data:")
display(df)

Original data:

	respondent_id	comments
0	1	I really enjoyed using this product. It exceed...
1	2	The customer service was terrible and the prod...
2	3	It was okay, nothing special but did the job I...
3	4	Absolutely love this! Best purchase I've made ...
4	5	I'm disappointed with the quality compared to ...
5	6	It's fine I guess, but I probably wouldn't buy...
6	7	Fantastic product and great value for money.
7	8	This is rubbish. Complete waste of money and t...
8	9	I'm neither happy nor unhappy with this purchase.
9	10	Well designed and does exactly what it says on...

[2]:

# Extract sentiment from comments; the result gains positive, neutral and
# negative score columns plus an overall sentiment label
df_with_sentiment = df.extract_sentiment(input_column='comments')

# Display results
print("\nData with sentiment analysis:")
display(df_with_sentiment)

# Summarize sentiment distribution
print("\nSentiment distribution:")
display(df_with_sentiment['sentiment'].value_counts())

# Show the three comments with the highest positive and negative scores
result_columns = ['comments', 'positive', 'neutral', 'negative', 'sentiment']

print("\nMost positive comments:")
display(df_with_sentiment.sort_values('positive', ascending=False).head(3)[result_columns])

print("\nMost negative comments:")
display(df_with_sentiment.sort_values('negative', ascending=False).head(3)[result_columns])


Data with sentiment analysis:

	respondent_id	comments	positive	neutral	negative	sentiment
0	1	I really enjoyed using this product. It exceed...	0.988290	0.009409	0.002302	positive
1	2	The customer service was terrible and the prod...	0.002001	0.017636	0.980363	negative
2	3	It was okay, nothing special but did the job I...	0.857669	0.132542	0.009788	positive
3	4	Absolutely love this! Best purchase I've made ...	0.991273	0.007251	0.001476	positive
4	5	I'm disappointed with the quality compared to ...	0.002320	0.030413	0.967267	negative
5	6	It's fine I guess, but I probably wouldn't buy...	0.052894	0.443942	0.503164	negative
6	7	Fantastic product and great value for money.	0.976262	0.020904	0.002834	positive
7	8	This is rubbish. Complete waste of money and t...	0.004194	0.023230	0.972576	negative
8	9	I'm neither happy nor unhappy with this purchase.	0.075750	0.327490	0.596760	negative
9	10	Well designed and does exactly what it says on...	0.758983	0.219506	0.021511	positive


Sentiment distribution:

sentiment
positive    5
negative    5
Name: count, dtype: int64


Most positive comments:

	comments	positive	neutral	negative	sentiment
3	Absolutely love this! Best purchase I've made ...	0.991273	0.007251	0.001476	positive
0	I really enjoyed using this product. It exceed...	0.988290	0.009409	0.002302	positive
6	Fantastic product and great value for money.	0.976262	0.020904	0.002834	positive


Most negative comments:

	comments	positive	neutral	negative	sentiment
1	The customer service was terrible and the prod...	0.002001	0.017636	0.980363	negative
7	This is rubbish. Complete waste of money and t...	0.004194	0.023230	0.972576	negative
4	I'm disappointed with the quality compared to ...	0.002320	0.030413	0.967267	negative
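
In the results above, the sentiment label always coincides with the highest of the three scores, and no comment is labelled neutral. The sketch below checks that observation on three rows copied from the output. It only uses plain pandas, and the `top_label` and `matches` column names are made up for illustration; it does not show how the toolkit assigns labels internally.

```python
import pandas as pd

# Scores copied from rows 5, 8 and 9 of the output table above
scores = pd.DataFrame(
    {
        "positive": [0.052894, 0.075750, 0.758983],
        "neutral": [0.443942, 0.327490, 0.219506],
        "negative": [0.503164, 0.596760, 0.021511],
        "sentiment": ["negative", "negative", "positive"],
    },
    index=[5, 8, 9],
)

score_cols = ["positive", "neutral", "negative"]

# Name of the highest-scoring column in each row (hypothetical helper column)
scores["top_label"] = scores[score_cols].idxmax(axis=1)

# True where the reported label equals the highest-scoring column
scores["matches"] = scores["top_label"] == scores["sentiment"]

print(scores[["sentiment", "top_label", "matches"]])
```

If the same check is run on the full `df_with_sentiment`, any row where `matches` is False would point to a labelling rule other than "highest score wins".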

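Two of the comments labelled negative ("It's fine I guess..." and "I'm neither happy nor unhappy...") have a top score below 0.6, so the label is much less clear-cut than for the other rows. One possible way to surface such cases is to replace low-confidence labels with a placeholder for manual review. This is a sketch only: the 0.6 cut-off, the `MIN_CONFIDENCE` name, the "uncertain" label and the `sentiment_checked` column are assumptions for illustration and are not part of the toolkit.

```python
import pandas as pd

# Scores copied from rows 2, 5, 8 and 9 of the output table above
scores = pd.DataFrame(
    {
        "positive": [0.857669, 0.052894, 0.075750, 0.758983],
        "neutral": [0.132542, 0.443942, 0.327490, 0.219506],
        "negative": [0.009788, 0.503164, 0.596760, 0.021511],
        "sentiment": ["positive", "negative", "negative", "positive"],
    },
    index=[2, 5, 8, 9],
)

score_cols = ["positive", "neutral", "negative"]
MIN_CONFIDENCE = 0.6  # assumed cut-off; tune it for your own data

# Highest of the three scores in each row
top_score = scores[score_cols].max(axis=1)

# Keep the label where the top score clears the cut-off, otherwise mark it
scores["sentiment_checked"] = scores["sentiment"].where(
    top_score >= MIN_CONFIDENCE, "uncertain"
)

print(scores[["sentiment", "sentiment_checked"]])
```

With this cut-off, rows 5 and 8 are marked "uncertain" while rows 2 and 9 keep their original label.
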