Data Science Club 2019-2020

Data Science Club meeting
(Spring term 2019/20)

1. The first Data Science Club meeting (Spring term 2019/20)

Welcome to this term's Data Science Club.

We will meet this Friday (31 Jan), 1:30-2:30pm, in G39, Polly Vacher Building.

In this Friday’s meeting, we will have:

 1) Video lecture: Introduction to reinforcement learning

 https://www.youtube.com/watch?v=4SLGEq_HZxk

2) Tutorial code: Python for reinforcement learning

https://www.learndatasci.com/tutorials/reinforcement-q-learning-scratch-python-openai-gym/

3) Competition registration: SemEval 2020 (NLP and image related). If you are interested, please fill in this form by next Monday: https://forms.gle/qKKAvNoriq1yy3cf9

We are calling for teams to participate in the SemEval 2020 competition (http://alt.qcri.org/semeval2020/). You will need to apply data engineering, data mining, machine learning, and AI-related techniques to analyse data and build models. You will have a chance to work in a team with undergraduates, master's students, PhD students, and staff members, and to build systems and produce publications. (It is a world-renowned competition, which is very good for your CV and career.) I will provide supervision and suggestions (as team leader, my team ranked 3rd out of 34 teams from around the world in the 2016 sentiment analysis task).

The timeline is as follows:

Competition and Beyond:

19 February 2020: Evaluation start*
11 March 2020: Evaluation end*
18 March 2020: Results posted
17 April 2020: System description paper submissions due
24 April 2020: Task description paper submissions due
10 June 2020: Author notifications
1 July 2020: Camera ready submissions due
13-14 September 2020: SemEval 2020

Please check the SemEval 2020 task page (http://alt.qcri.org/semeval2020/index.php?id=tasks) for more detailed information about the tasks.

Note that, as with all clubs in the department, there is no obligation whatsoever and *no formal* membership. You can simply attend the meetings if you find the time!

This club is co-organized by PhD student Thanet Markchom (thanet.mar@gmail.com).

If you have questions or suggestions, please contact me or Thanet.

Data Science Club meeting
(Autumn term 2019/20)

1. The third data science club meeting

will be at 12:30-1:30 on Friday (15 Nov), in G45.
We will talk about neural networks. Here is the reference material.
 

2. The second data science club meeting

will be from 12:30-1:30 on Friday (25 Oct), in G39 Polly Vacher. We will talk about linear models, part 2: classification.

Please bring your laptop with Python and scikit-learn (sklearn) installed.

1) Video: Please watch these two videos at home:

https://www.youtube.com/watch?v=0kns1gXLYg4

https://www.youtube.com/watch?v=F6GSRDoB-Cg

You are required to do the following:
(1) data cleaning: remove Twitter Handles (@user)
(2) tf-idf weighting
(3) logistic regression prediction model
(4) evaluate the accuracy of prediction
 
Please do some work at home. We will only have time to go through the video and tutorial very quickly in the class.
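
The four steps listed above for the second meeting can be sketched roughly as follows. This is only a minimal illustration, not the tutorial's own code: the tweets and labels are invented toy data, the helper name remove_handles is hypothetical, and the default scikit-learn settings are an assumption rather than a recommendation.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Toy tweets invented for illustration; 1 = positive, 0 = negative.
tweets = [
    "@user I love this sunny day",
    "@user what a great and happy morning",
    "so happy with the great service @shop",
    "love the new phone, great camera @friend",
    "@user I hate waiting in the rain",
    "what a terrible and sad evening @user",
    "so angry about the terrible service @shop",
    "hate the new update, awful battery @friend",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]


def remove_handles(text):
    """Step (1): strip Twitter handles such as @user and tidy whitespace."""
    without_handles = re.sub(r"@\w+", "", text)
    return " ".join(without_handles.split())


cleaned = [remove_handles(t) for t in tweets]

# Hold out a quarter of the data so accuracy is measured on unseen tweets.
X_train, X_test, y_train, y_test = train_test_split(
    cleaned, labels, test_size=0.25, random_state=0, stratify=labels
)

# Step (2): tf-idf weighting, fitted on the training tweets only.
vectorizer = TfidfVectorizer()
X_train_tfidf = vectorizer.fit_transform(X_train)
X_test_tfidf = vectorizer.transform(X_test)

# Step (3): logistic regression prediction model.
model = LogisticRegression()
model.fit(X_train_tfidf, y_train)
predictions = model.predict(X_test_tfidf)

# Step (4): evaluate the accuracy of the predictions.
accuracy = accuracy_score(y_test, predictions)
print(f"Test accuracy: {accuracy:.2f}")
```

With only eight toy tweets the accuracy figure is not meaningful; the point is the order of the steps, in particular that the tf-idf vocabulary is fitted on the training split and only applied to the test split.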

3. The first data science club meeting

was at 12:30-1:30 on Friday (7 Oct). We will talk about linear models.

Please bring your laptop with Python and scikit-learn (sklearn) installed.

1) Video lecture: Linear Regression With One Variable (8 minutes): https://www.youtube.com/watch?v=kHwlB_j7Hkc

2) Coding tutorial, a first machine learning example (linear regression): https://www.kdnuggets.com/2019/03/beginners-guide-linear-regression-python-scikit-learn.html

For the tutorial, please think about the following:

– Input, data type, pre-processing, output?

– Model (regression or classification)? Math formula, loss function?

– Training, validation, test?

– Evaluation metrics?

– Which code lines are related to these?
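
As one possible way to map the questions above onto code, here is a small sketch of linear regression with one variable in plain Python. It is not the tutorial's code: the data are invented, the function names are hypothetical, and it assumes the model y = w*x + b fitted by ordinary least squares, with mean squared error as both the loss and the evaluation metric. There is no validation set because this model has no settings to tune.

```python
# Input: one numeric feature per example. Output: one numeric target.
# Toy data invented for illustration: x = hours studied, y = exam score.
x_train = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_train = [52.0, 55.0, 61.0, 64.0, 70.0, 73.0]
x_test = [7.0, 8.0]
y_test = [77.0, 82.0]


def fit_line(xs, ys):
    """Training: least-squares estimates of w and b in y = w*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    w = covariance / variance
    b = mean_y - w * mean_x
    return w, b


def predict(xs, w, b):
    """Model formula: apply y = w*x + b to each input."""
    return [w * x + b for x in xs]


def mean_squared_error(ys, predictions):
    """Loss and metric: average squared gap between target and prediction."""
    return sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys)


w, b = fit_line(x_train, y_train)
train_mse = mean_squared_error(y_train, predict(x_train, w, b))
test_mse = mean_squared_error(y_test, predict(x_test, w, b))

print(f"w = {w:.3f}, b = {b:.3f}")
print(f"train MSE = {train_mse:.3f}, test MSE = {test_mse:.3f}")
```

Comparing the training and test errors is the simplest check of whether the fitted line generalises to examples it has not seen.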

4. About Data Science Club

In this club, we define Data Science = creative ideas + data + programming (Machine Learning/Artificial Intelligence/Data Mining/Data Visualization). This club is to help students learn and practice data science skills in a more relaxed, flipped, social, and fun way.

We will organize short video lectures, hands-on tutorials, and practice workshops. This club will also support students in forming teams to participate in data science hackathons or competitions. What you learn will be useful for any free-form hackathon event, for finding a Data Scientist job, or for creating your own start-up.

Example events: Kaggle competitions (https://www.kaggle.com/competitions), Australian government open data hackathon (https://www.govhack.org/).

Everyone is welcome. You don’t need any background knowledge to follow the video lectures. You may need some programming background (e.g., Python, Java) to follow the tutorial sessions.

The plan is to meet once every two weeks. The content covers:

1) Theory: linear model, neural networks, reinforcement learning, generative models, …

2) Application: Chat bots, poem writing program, computer music composer, game play agent, …

Note that, as with all clubs in the department, there is no obligation whatsoever and *no formal* membership. You can simply attend the meetings if you find the time!