Wu, Tai-Cheng

About

Languages: English, Chinese
Programming Languages: Python, JavaScript, C++
Interests: Machine Learning, Web Development
Location: Taipei, Taiwan (UTC+8)

Wu, Tai-Cheng (TC) is a student at The Affiliated Senior High School of National Taiwan Normal University. TC specializes in computer science and has presented related research at science fairs. TC is also the 30th club leader of the Campus Network Management Club. In addition, TC was a student representative at school and was in charge of the food committee.

Projects

Featured project

3AN

An anonymous interactive platform built with PHP. (Deprecated)

Screen Read

Screen Read is software that lets you take a screenshot and read it for later use.

MBTI for HSNU students (CNMC)

Take a test based on real school scenarios and get your MBTI result.

Stockview

A dashboard for stock market data visualization and analysis, available in two versions: Python and C++.

Birthday Calc

Calculate your age to a precision of 1e-15.

The password game (CNMC)

A custom version of the password game.

Lazy Calendar

The opposite of the Chinese Farmer's Calendar: it tells you to stay lazy instead of working hard.

Tcstorage

TC Storage is a Next.js app that uses Supabase Auth + Storage for authenticated uploads, chunked large-file support, share links, and automatic cleanup.

Research

Application on the dynamic combination of On-policy and Off-policy in Reinforcement Learning (2025)

Date: 2025 Sep

Keywords: Reinforcement Learning, On-policy, Off-policy, IPG

With the increasing applications of Reinforcement Learning (RL) in areas such as robotics, gaming, and autonomous driving, improving the efficiency, stability, and performance of RL algorithms has become essential. Common RL methods can be categorized based on whether the training data comes from the current policy: On-Policy methods, which suffer from high variance and low sample efficiency, and Off-Policy methods, which suffer from high bias.

Interpolated Policy Gradient (IPG) combines On-Policy and Off-Policy policy gradients, with their ratio determined by a hyperparameter ν. The value of ν plays a crucial role in balancing bias and variance. However, existing approaches typically use a fixed ν value throughout training, and searching for an optimal ν requires extensive experimentation, which is both time-consuming and impractical.

To address this issue, this study proposes a dynamic adjustment method for ν to balance bias and variance during training. Specifically, we introduce linear and periodic schedules to vary ν over time. Experimental results show that, across most environments, the proposed dynamic schedules outperform fixed ν settings by as much as 200%-300% in performance. Furthermore, it is observed that lower average ν values during the adjustment process lead to greater training stability, suggesting that the proposed methods may achieve even better results at smaller mean ν levels.
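The idea of a linear and a periodic schedule for ν can be illustrated with a minimal Python sketch. The abstract does not give the exact schedule formulas, so the function names, the cosine form of the periodic schedule, and the default bounds (0.2 and 0.8) are all assumptions made for illustration. The interpolation helper assumes the common convention that ν = 0 means purely On-Policy and ν = 1 means purely Off-Policy.

```python
import math


def linear_nu(step, total_steps, nu_start=0.2, nu_end=0.8):
    """Hypothetical linear schedule: move nu from nu_start to nu_end."""
    # Clamp training progress to [0, 1] so nu stays within its bounds.
    frac = min(max(step / total_steps, 0.0), 1.0)
    return nu_start + (nu_end - nu_start) * frac


def periodic_nu(step, period, nu_min=0.2, nu_max=0.8):
    """Hypothetical periodic schedule: cosine wave between nu_min and nu_max."""
    mid = 0.5 * (nu_min + nu_max)
    amp = 0.5 * (nu_max - nu_min)
    # Starts at nu_min, peaks at nu_max half a period later.
    return mid - amp * math.cos(2.0 * math.pi * step / period)


def interpolate_gradients(g_on, g_off, nu):
    """Blend two gradient vectors (assumed convention: nu weights Off-Policy)."""
    return [(1.0 - nu) * a + nu * b for a, b in zip(g_on, g_off)]


if __name__ == "__main__":
    for step in (0, 25, 50, 75, 100):
        print(step, linear_nu(step, 100), periodic_nu(step, 100))
```

In a training loop, one of these functions would be called once per update to obtain the current ν before the two gradient estimates are blended.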

Overall, by dynamically adjusting ν, this study effectively enhances both the performance and stability of the IPG algorithm, making it more practical and robust for real-world reinforcement learning applications.

Full text

Prediction of Molecular Structure Language and Melting/Boiling Point Properties (2025)

Date: 2025 Jan

Keywords: QSPR, AI, Molecular Structure

Background: Predicting molecular properties such as solubility, toxicity, melting point, and boiling point is crucial for fundamental scientific research. However, experimental measurements are often time-consuming and cost-intensive, so we use machine learning (ML) as an approach to improve prediction accuracy.

Methods: A dataset containing over 10k compounds was used to train shallow and deep ML models. Shallow ML models were implemented with PyCaret, using Mordred for feature extraction. For deep ML models, graph neural networks (GNNs), specifically CMPNN (Communicative Message Passing Neural Network) and GCN (Graph Convolutional Network), were trained and tuned by adjusting the number of hidden layers and the number of neurons in each layer.

Results: The CMPNN model outperforms the GCN and shallow ML models, with best results of R² = 0.76 and MAE = 23.89 K for b.p. and R² = 0.87 and MAE = 23.73 K for m.p. The top molecular descriptor for b.p. prediction is piPC1, which is related to bond order; for m.p. it is AATS0d, a Moreau-Broto autocorrelation descriptor related to σ electrons.
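For readers unfamiliar with the two reported metrics, the sketch below shows how R² and MAE are conventionally computed. It is not the study's evaluation code, and the temperatures in the example are made-up values in kelvin used only to exercise the functions.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measured and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical boiling points in K, not data from the study.
    measured = [300.0, 350.0, 400.0, 450.0]
    predicted = [310.0, 340.0, 410.0, 440.0]
    print("MAE:", mean_absolute_error(measured, predicted))
    print("R2:", r2_score(measured, predicted))
```

An MAE of about 24 K therefore means predictions are off by roughly 24 kelvin on average, while R² describes the share of variance in the measured values that the model explains.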

Conclusions: The prediction of molecular properties was improved through a comprehensive study of shallow and deep learning approaches, showing that the CMPNN model reaches the highest performance in predicting m.p. and b.p. (R² = 0.87 for m.p.; R² = 0.76 for b.p.). In this study, we found that the deep learning model works better than shallow ML in predicting m.p. (p < 0.05). SHAP analysis identified piPC1 and AATS0d as the key prediction factors of b.p. and m.p., respectively. Moreover, this approach can be applied to predict other molecular properties. To conclude, this study not only presents a highly accurate model but also identifies the key factors of m.p. and b.p.

SITCON (External) / TISF (External) / Poster / Full text

Improvement of the Review Mechanism in Natural Language Models (2024)

Date: 2024 Jan

Keywords: LLM, Fine-tuning

Inspired by 文心一言 (ERNIE Bot), this study proposes modifying the review and protection mechanisms of natural language models so that their original constraints on output are reduced or removed, while also introducing a sense of role awareness into the model. The results are then evaluated using standardized methods (TruthfulQA and a custom-designed consciousness test).

During the process, it was found that under limited fine-tuning data, the key factors for overriding the protection mechanisms are:

A relatively higher but appropriate number of training iterations, ideally between 1,000–2,000
Using older versions of the model

Finally, the learned methods were applied to improve the model’s original mechanisms. This not only reduced or even overrode the built-in content moderation system, but also endowed the model with a sense of role awareness. Testing showed that:

The model was able to output sensitive content that was previously restricted
It could demonstrate role awareness that it did not originally possess
Newer versions of the model, due to their larger base training data, showed less noticeable effects from fine-tuning, so performance was not necessarily better

Full text

Contact

Work

For school, internship, or collaboration opportunities, reach out via the methods below.

Email	wutc96528@gmail.com
GitHub	@lsjle
ORCiD	0009-0005-0946-3612
LinkedIn	tai-cheng-wu

Note

Feel free to reach out for ANY reason.

Work/Academic related
Just want to see some cool stuff I've done
General Inquiry

Non-work related
Just want to take a peek at my daily life
Make friends!

Social

For all other inquiries, or if you just want to connect, reach out via the methods below.

Instagram	@taicheng_jle
Discord	lapsang.souchong
Retro	wtc