69 commits
f00f378
data_crawl notebook
KLANJRI Jun 22, 2021
b7c08c3
Update README.md
KLANJRI Jul 6, 2021
fbb98c1
QS_wordcloud.img
KLANJRI Jul 6, 2021
fe6cf3a
Update README.md
KLANJRI Jul 6, 2021
6e21674
datasets
KLANJRI Jul 6, 2021
3164722
Create global_df
KLANJRI Jul 6, 2021
0e85850
NLP Analysis
KLANJRI Jul 13, 2021
3cfbec9
Create txt
KLANJRI Jul 13, 2021
9194276
social_network_df
KLANJRI Jul 13, 2021
f449e63
Delete network_df.csv.zip
KLANJRI Jul 13, 2021
d504989
Delete global_df.csv 2.zip
KLANJRI Jul 13, 2021
85bdfae
social_network_gephi_graphs
KLANJRI Jul 13, 2021
6b6467d
Delete txt
KLANJRI Jul 13, 2021
e3de53c
qs_wordcloud.png
KLANJRI Jul 13, 2021
8e2af00
Update README.md
KLANJRI Jul 13, 2021
29038ed
Delete qs_wordcloud.png
KLANJRI Jul 13, 2021
39f1dd4
text_data_analysis
KLANJRI Jul 13, 2021
16ef90a
Create requirements.txt
KLANJRI Jul 22, 2021
56470c1
Delete NLP(1).ipynb
KLANJRI Jul 23, 2021
0e5158b
notebook update
KLANJRI Jul 23, 2021
7cdca3e
add_dashboard
KLANJRI Jul 26, 2021
827c945
update_df
KLANJRI Jul 26, 2021
577379b
Add files via upload
KLANJRI Jul 26, 2021
fccdf08
Create txt
KLANJRI Jul 26, 2021
990cd02
py_app_files
KLANJRI Jul 26, 2021
b9051f7
update
KLANJRI Jul 26, 2021
c6fa23c
update plot
KLANJRI Jul 26, 2021
fd07c29
app_update
KLANJRI Jul 27, 2021
ecb069e
Update README.md
KLANJRI Jul 27, 2021
f7a1b21
word_freq df
KLANJRI Jul 27, 2021
69b106e
Delete words_df.csv
KLANJRI Jul 27, 2021
f0b7df2
Delete words_df_2021.csv
KLANJRI Jul 27, 2021
7153e81
Delete words_df_20_21.csv
KLANJRI Jul 27, 2021
d36ec44
update word_freq df
KLANJRI Jul 27, 2021
73a4625
app_update
KLANJRI Jul 27, 2021
f7125df
Merge remote-tracking branch 'origin/main' into main
KLANJRI Jul 27, 2021
a9fb36c
app_update
KLANJRI Jul 27, 2021
f175d14
Delete words_df.csv
KLANJRI Jul 27, 2021
15cf0e5
Add files via upload
KLANJRI Jul 27, 2021
bd882be
Add files via upload
KLANJRI Jul 27, 2021
c66b3b2
app_update
KLANJRI Jul 27, 2021
e4e6f20
app_update
KLANJRI Jul 27, 2021
079492e
app_update
KLANJRI Jul 27, 2021
df4dffb
Delete plotly_words_timeseries.html
KLANJRI Jul 27, 2021
7dbb123
Delete TOPIC_Model.html
KLANJRI Jul 27, 2021
3a854f2
Delete qs_wordcloud.png
KLANJRI Jul 27, 2021
fe63b4e
data_plots
KLANJRI Jul 27, 2021
ea0e12e
df_LDA
KLANJRI Jul 27, 2021
30d0135
app_update
KLANJRI Jul 27, 2021
cb39e28
Merge remote-tracking branch 'origin/main' into main
KLANJRI Jul 27, 2021
c45b099
app_update
KLANJRI Jul 28, 2021
efeeb50
app_update
KLANJRI Jul 28, 2021
10783f6
app_update
KLANJRI Jul 28, 2021
7b3755b
app_update
KLANJRI Jul 28, 2021
f0f2371
topic_score_chart
KLANJRI Jul 29, 2021
4a0b00e
app_update
KLANJRI Jul 29, 2021
d5263b9
Merge remote-tracking branch 'origin/main' into main
KLANJRI Jul 29, 2021
833e782
app_update
KLANJRI Jul 29, 2021
fe99342
app_update
KLANJRI Jul 29, 2021
6205078
app_update
KLANJRI Jul 29, 2021
e27079d
Add files via upload
KLANJRI Jul 29, 2021
d4174ec
app_update
KLANJRI Jul 29, 2021
d2cae8a
app_update
KLANJRI Jul 29, 2021
729aed3
app_update
KLANJRI Jul 29, 2021
4837f1c
app_update
KLANJRI Jul 30, 2021
81abe37
app_update
KLANJRI Jul 30, 2021
b674520
app_update
KLANJRI Aug 2, 2021
bbfcce4
add lda idea
gedankenstuecke Aug 2, 2021
5134347
test1
KLANJRI Aug 18, 2021
Binary file added Data_Viz/ORG_mentioned.png
Binary file added Data_Viz/Products_mentioned.png
Binary file added Data_Viz/Topic_Clustering.png
Binary file added Data_Viz/bigram.png
Binary file added Data_Viz/coherence_score_chart.png
Binary file added Data_Viz/colour_comunity_network.png
Binary file added Data_Viz/important_node.png
Binary file added Data_Viz/label_nodes.png
Binary file added Data_Viz/qs_wordcloud.png
Binary file added Data_Viz/topic_model_words.png
Binary file added Data_Viz/trigram.png
18,485 changes: 18,485 additions & 0 deletions LDA-test1.ipynb

Large diffs are not rendered by default.

921 changes: 921 additions & 0 deletions LDA.ipynb

Large diffs are not rendered by default.

15,522 changes: 15,522 additions & 0 deletions NLP.ipynb

Large diffs are not rendered by default.

29 changes: 28 additions & 1 deletion README.md
@@ -1,2 +1,29 @@
# quantified-self-forums
analyses of the QS forum posts

Quantified Self is a community website for self-trackers and self-researchers interested in personal science. It encourages all types of researchers to post about their research, ask for advice, and explore different methods of self-tracking through wearables.

- Posts are organised by category and carry various tags: ‘data’, ‘tools’, ‘diet’, ‘conference’, ‘food’…

<p align="center">
<img src="./Data_Viz/qs_wordcloud.png" alt="QS wordcloud" width="700">
</p>

## Online Dashboard
For a closer look at the results, open the [interactive dashboard](https://share.streamlit.io/kaoutarlanjri/quantified-self-forums/main/webapp/app.py).

## Expected Results
- The aim of this project is to understand the community forum through data analysis and natural language processing (NLP) of the community's interactions.
- Provide a transparent analysis of how members communicate and network, to improve current and future projects in community and personal science.

## Preliminary results
* PCA clustering model showing types of user engagement
* LDA topic model showing the different topics of posts
* Named entity recognition: products, organisations, persons
* Text classification
* Social network analysis graph

## Technical Environment
Main platforms: Python and the CorTexT Platform. Python libraries used for data pre-processing:
NLTK, spaCy, Gensim. Libraries for modelling, data visualisation, and data collection: pyLDAvis, sklearn, matplotlib, wordcloud, plotly, seaborn, requests, Beautiful Soup.
Gephi 0.9.2 was used for network analysis.

2,118 changes: 2,118 additions & 0 deletions datasets/df_LDA.csv

Large diffs are not rendered by default.

Binary file added datasets/global_df.zip
Binary file not shown.
Binary file added datasets/network_df.csv.zip
Binary file not shown.
12 changes: 12 additions & 0 deletions datasets/words_df.csv
@@ -0,0 +1,12 @@
creation_year,data,track,sleep,work,app,device,self,people,help,test,start,health,measure,zeo,post,new,question,rate,user,heart,change,project,sensor,tool,different,activity,idea,apps,record,research
2011,549,312,207,299,397,72,248,227,203,170,146,138,146,29,147,134,158,210,110,78,149,97,66,87,71,61,103,48,81,94
2012,826,401,208,297,573,118,181,227,161,141,167,155,194,24,140,179,109,245,116,73,96,96,32,118,117,89,90,84,52,103
2013,2072,680,792,829,1339,432,335,340,417,216,362,236,325,834,361,315,241,487,215,258,283,178,320,160,208,213,243,163,205,187
2014,1597,837,736,749,1245,404,356,306,353,276,341,419,341,434,344,361,305,456,261,209,170,197,324,222,171,208,229,194,160,200
2015,815,472,247,304,778,177,175,117,172,79,109,244,171,109,107,166,124,257,89,148,106,70,74,91,65,67,82,125,74,93
2016,1080,470,566,360,821,253,133,161,170,211,156,228,197,301,140,160,89,333,114,150,96,101,148,135,97,112,104,123,97,125
2017,426,427,340,223,487,122,121,112,122,162,99,151,130,15,79,70,111,243,50,123,75,66,71,74,82,86,78,72,71,70
2018,426,282,203,202,395,121,78,102,111,170,94,124,127,7,89,68,67,160,66,51,55,61,22,53,73,50,64,44,60,55
2019,729,449,226,290,587,111,169,129,178,142,163,133,213,50,103,127,128,251,110,133,96,132,73,96,75,68,105,63,118,61
2020,909,570,195,356,636,127,257,140,154,138,182,139,252,1,133,127,101,304,140,131,143,155,61,105,125,90,116,83,98,113
2021,293,178,165,153,161,41,58,45,56,36,43,65,84,0,44,40,41,114,26,41,26,31,16,20,33,31,30,13,40,38
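
As a rough illustration of how the yearly counts above might be read, the sketch below parses a small excerpt of `datasets/words_df.csv` (the first five word columns for 2011 to 2013) and lists the most frequent words per year. The function name `top_words_per_year` and the constant `SAMPLE` are hypothetical and not part of the repository. Only the Python standard library is used, so this is not necessarily how the notebooks in this pull request process the file.

```python
import csv
import io

# Excerpt copied from datasets/words_df.csv: first five word columns, three years.
SAMPLE = """creation_year,data,track,sleep,work,app
2011,549,312,207,299,397
2012,826,401,208,297,573
2013,2072,680,792,829,1339
"""


def top_words_per_year(csv_text, n=2):
    """Return {year: [(word, count), ...]} holding the n most frequent words."""
    result = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        # The first column is the year; every remaining column is a word count.
        year = int(row.pop("creation_year"))
        counts = [(word, int(value)) for word, value in row.items()]
        counts.sort(key=lambda pair: pair[1], reverse=True)
        result[year] = counts[:n]
    return result


top = top_words_per_year(SAMPLE)
for year, words in top.items():
    print(year, words)
```

To run this against the full file rather than the excerpt, pass the file's text (for example from `open(path, newline="").read()`) instead of `SAMPLE`.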
82 changes: 82 additions & 0 deletions datasets/words_df_2021.csv
@@ -0,0 +1,82 @@
creation_date,data,track,sleep,work,self,app,measure,metric,start,question,export,help,efn,amp,heart,rate,health,emfit,experiment,example,device,people,note,analysis,food,learn,dandruff,eat,oura
2021-01-01,0,6,0,0,1,4,0,0,3,1,0,0,0,7,0,0,0,0,0,2,1,0,0,0,8,3,6,23,1
2021-01-02,0,4,0,5,1,8,6,4,2,1,0,1,0,0,0,1,0,0,3,0,0,0,2,1,5,3,6,2,0
2021-01-03,21,1,5,0,5,5,1,1,0,1,0,0,0,3,1,1,0,0,2,2,0,3,0,8,0,1,1,1,0
2021-01-04,10,0,0,7,0,2,0,1,1,1,1,1,0,7,0,0,0,0,1,3,0,1,0,1,0,0,3,1,0
2021-01-05,18,3,3,6,3,3,1,1,1,0,0,3,0,6,0,1,1,0,1,3,0,5,4,2,0,2,3,1,0
2021-01-06,14,2,0,3,3,1,0,0,0,0,0,1,0,3,0,0,0,0,0,3,0,0,0,1,0,0,1,2,0
2021-01-07,0,1,0,1,0,0,1,0,1,0,0,0,0,2,0,0,0,0,0,0,0,1,0,0,0,0,3,2,0
2021-01-08,15,1,2,1,1,3,4,8,2,1,0,1,0,2,2,0,0,0,2,1,0,1,1,6,0,4,1,4,3
2021-01-09,4,4,0,0,2,1,1,0,1,0,0,1,0,0,0,1,0,0,1,0,0,1,0,0,2,1,2,6,0
2021-01-10,0,2,2,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,2,0
2021-01-11,2,1,0,0,0,0,0,0,0,0,0,0,0,0,0,3,0,2,0,0,0,0,1,0,2,0,0,2,2
2021-01-12,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,1,0
2021-01-13,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,1,0,0,2,0,0,0,0,0,1,0,2,2,0
2021-01-14,0,4,0,2,0,0,2,1,0,2,0,0,0,0,0,0,0,0,2,0,0,0,0,0,1,0,2,3,0
2021-01-15,0,2,1,1,0,2,0,0,0,0,0,0,0,2,0,0,0,0,0,1,0,1,0,1,0,0,0,0,0
2021-01-17,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,1,0,0,0,1,0
2021-01-18,5,0,0,1,1,9,0,0,0,0,0,1,0,0,0,1,1,0,0,0,1,0,1,0,0,1,0,5,0
2021-01-21,2,0,0,0,0,1,0,0,2,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,3,1,1
2021-01-24,1,0,0,0,0,4,2,0,0,1,0,0,0,0,0,0,1,0,1,0,0,0,0,0,1,1,0,1,0
2021-01-25,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0
2021-01-26,1,0,5,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0
2021-01-27,0,2,7,0,1,2,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,2
2021-01-28,1,14,17,4,0,12,3,0,0,0,1,0,0,2,1,4,10,1,1,1,2,0,0,0,0,0,0,1,7
2021-01-29,6,3,0,3,0,10,1,0,2,0,1,0,0,0,0,2,0,0,0,0,0,3,0,0,0,0,0,1,0
2021-01-30,0,0,2,1,0,0,0,1,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,1,0
2021-01-31,0,2,4,3,0,3,4,0,0,0,1,0,0,1,0,3,1,0,0,0,2,0,0,0,0,1,0,0,0
2021-02-01,1,1,1,0,0,0,4,0,0,0,0,0,0,0,1,1,1,0,0,0,2,0,1,0,0,1,0,3,0
2021-02-03,10,29,4,2,3,5,10,2,4,2,2,8,0,2,0,3,2,0,11,2,0,1,0,3,7,8,21,18,2
2021-02-04,0,3,1,3,0,0,0,0,1,0,0,3,0,0,0,0,0,0,0,0,1,0,0,0,0,2,0,1,0
2021-02-05,8,0,0,3,1,3,0,1,1,4,0,4,0,2,1,2,6,0,0,2,0,1,0,1,1,1,0,6,1
2021-02-06,3,0,0,0,3,0,0,1,2,2,0,1,0,1,0,0,1,0,0,0,1,3,0,0,0,2,0,2,0
2021-02-07,40,19,59,5,0,13,18,17,3,0,13,2,26,56,25,45,8,43,0,3,14,1,26,4,0,5,0,4,48
2021-02-08,27,3,1,3,2,3,0,0,1,2,2,5,0,3,0,2,1,0,5,1,1,2,1,4,6,2,1,12,1
2021-02-09,18,1,4,1,0,4,3,0,1,1,0,3,0,1,1,1,0,1,0,0,2,1,0,1,0,2,0,2,2
2021-02-10,6,3,12,1,2,3,0,0,1,2,0,1,0,1,0,4,1,5,0,0,9,2,0,1,0,0,0,0,5
2021-02-11,1,4,7,0,1,2,2,0,1,3,0,0,0,1,0,0,0,0,1,1,0,1,1,0,0,2,0,3,1
2021-02-13,4,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0
2021-02-14,1,0,0,1,0,1,2,0,0,0,0,0,0,0,2,3,2,0,0,0,0,0,0,1,0,0,0,0,0
2021-02-15,0,2,3,1,0,1,0,4,1,0,0,0,13,10,0,0,8,1,1,4,0,0,14,1,0,0,0,4,0
2021-02-16,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2021-02-18,2,0,0,1,0,0,0,0,0,0,0,0,0,1,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0
2021-02-19,0,3,1,5,2,2,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0
2021-02-20,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0
2021-02-21,0,1,3,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0
2021-02-22,6,2,0,3,0,0,0,0,0,0,0,0,0,0,1,1,0,0,1,0,0,0,0,0,0,2,0,3,0
2021-02-23,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2021-02-25,0,1,0,0,2,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0
2021-02-28,1,3,0,1,2,2,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0
2021-03-01,3,0,0,4,0,1,0,0,0,0,0,0,0,0,3,3,0,0,0,0,0,0,0,1,0,0,0,0,0
2021-03-02,0,0,1,2,0,1,0,0,1,0,0,4,0,0,0,1,0,0,0,0,0,0,0,0,2,2,0,3,0
2021-03-03,0,2,0,10,3,0,2,0,0,2,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0
2021-03-04,1,4,4,8,3,0,1,0,0,1,0,1,0,1,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0
2021-03-05,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2021-03-06,0,0,0,1,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0
2021-03-08,0,0,0,7,1,0,10,1,1,2,0,1,0,1,0,1,0,0,0,1,0,1,0,0,0,0,0,4,0
2021-03-09,4,2,0,14,1,1,1,2,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,4,0
2021-03-10,0,1,0,4,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,2,0
2021-03-11,1,1,0,7,5,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0
2021-03-13,0,0,0,1,0,0,0,0,0,0,0,2,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0
2021-03-14,0,0,0,1,2,0,0,0,0,1,0,0,0,1,0,0,0,0,0,1,0,1,0,0,0,0,0,2,0
2021-03-15,0,4,0,5,0,2,0,1,0,0,0,3,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,1,0
2021-03-16,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2021-03-17,6,5,0,4,0,11,0,0,1,2,3,2,0,2,0,2,6,0,1,0,1,5,0,2,2,0,0,11,0
2021-03-18,0,0,0,0,4,0,0,0,0,0,0,0,0,1,1,2,4,0,2,1,0,0,0,0,5,0,0,0,0
2021-03-19,1,0,0,0,0,2,1,0,0,2,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,1,0,0,0
2021-03-20,0,0,3,0,0,0,0,0,3,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,1,0,0,2,0
2021-03-22,2,0,0,1,0,1,0,0,0,0,0,0,0,1,1,2,0,2,0,1,1,0,0,0,0,0,0,1,0
2021-03-23,2,3,3,1,0,1,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0
2021-03-24,3,4,2,0,0,0,1,0,0,1,0,1,0,0,0,3,0,4,1,0,0,0,0,0,0,0,0,4,0
2021-03-25,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2021-03-26,1,2,0,1,0,6,0,0,0,2,1,0,0,0,0,3,2,0,0,0,0,2,0,0,0,0,0,0,0
2021-03-28,1,1,3,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0
2021-03-29,13,6,4,1,0,3,2,1,1,0,1,1,0,1,0,3,5,0,0,1,0,1,0,0,1,1,0,0,0
2021-03-30,10,4,0,2,0,3,0,0,0,0,4,0,0,9,0,3,0,0,0,1,1,1,0,2,0,0,0,1,0
2021-03-31,2,2,0,1,0,1,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2021-04-01,6,7,0,6,2,7,0,0,1,0,3,1,0,18,0,1,0,0,0,5,0,0,0,2,0,0,0,2,0
2021-04-02,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2021-04-03,7,1,1,1,0,3,0,0,0,0,3,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,2,0
2021-04-04,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2021-04-05,0,0,0,0,0,0,0,0,0,0,0,0,0,3,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0
2021-04-06,0,1,0,1,0,0,0,0,0,0,0,1,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0