Wu, Tai-Cheng is a student at The Affiliated Senior High School of National Taiwan Normal University. TC specializes in computer science and has published related research at science fairs. TC is also the 30th leader of the Campus Network Management Club, and has served as a student representative at the school and as the head of its food committee.
Featured project
An anonymous interactive platform built on PHP. (Deprecated)
Screenread is a tool that lets you take a screenshot and read it for later use.
Take a test based on real school scenarios and get your MBTI result.
A dashboard for stock market data visualization and analysis, available in two versions: Python and C++.
With the increasing applications of Reinforcement Learning (RL) in areas such as robotics, gaming, and autonomous driving, improving the efficiency, stability, and performance of RL algorithms has become essential. Common RL methods can be categorized based on whether the training data comes from the current policy: On-Policy methods, which suffer from high variance and low sample efficiency, and Off-Policy methods, which suffer from high bias.
Interpolated Policy Gradient (IPG) combines On-Policy and Off-Policy policy gradients, with their ratio determined by a hyperparameter ν. The value of ν plays a crucial role in balancing bias and variance. However, existing approaches typically use a fixed ν value throughout training, and searching for an optimal ν requires extensive experimentation, which is both time-consuming and impractical.
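As a rough illustration of the interpolation idea, the sketch below blends two gradient estimates with a weight ν. It assumes the convention that ν = 0 gives the pure On-Policy estimate and ν = 1 the pure Off-Policy estimate; the function name `interpolate_gradients` and the plain-list representation of gradients are hypothetical choices for this example, not the study's implementation.

```python
from typing import List, Sequence


def interpolate_gradients(
    g_on: Sequence[float], g_off: Sequence[float], nu: float
) -> List[float]:
    """Blend two gradient estimates as (1 - nu) * g_on + nu * g_off.

    Assumed convention: nu = 0 is purely on-policy, nu = 1 purely off-policy.
    """
    if not 0.0 <= nu <= 1.0:
        raise ValueError("nu must lie in [0, 1]")
    if len(g_on) != len(g_off):
        raise ValueError("gradient estimates must have the same length")
    return [(1.0 - nu) * a + nu * b for a, b in zip(g_on, g_off)]


if __name__ == "__main__":
    # Illustrative numbers only, not taken from any experiment.
    print(interpolate_gradients([1.0, 2.0], [3.0, 4.0], 0.25))
```

With a fixed ν, this blend stays the same for the whole run, which is the limitation the study targets.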
To address this issue, this study proposes a dynamic adjustment method for ν to balance bias and variance during training. Specifically, we introduce linear and periodic schedules to vary ν over time. Experimental results show that, across most environments, the proposed dynamic schedules outperform fixed ν settings by as much as 200%-300% in performance. Furthermore, it is observed that lower average ν values during the adjustment process lead to greater training stability, suggesting that the proposed methods may achieve even better results at smaller mean ν levels.
Overall, by dynamically adjusting ν, this study effectively enhances both the performance and stability of the IPG algorithm, making it more practical and robust for real-world reinforcement learning applications.
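The linear and periodic schedules mentioned above could be sketched roughly as follows. The exact schedule shapes, endpoints, and periods used in the study are not stated here, so the function names (`linear_nu`, `periodic_nu`), the cosine form of the periodic schedule, and all default values are assumptions made for illustration.

```python
import math


def linear_nu(step: int, total_steps: int,
              nu_start: float = 0.0, nu_end: float = 1.0) -> float:
    """Move nu linearly from nu_start to nu_end over total_steps (assumed form)."""
    # Clamp progress so steps past the end keep the final value.
    frac = min(max(step / total_steps, 0.0), 1.0)
    return nu_start + frac * (nu_end - nu_start)


def periodic_nu(step: int, period: int,
                nu_min: float = 0.0, nu_max: float = 1.0) -> float:
    """Oscillate nu between nu_min and nu_max with a cosine wave (assumed form)."""
    phase = 2.0 * math.pi * step / period
    return nu_min + 0.5 * (nu_max - nu_min) * (1.0 - math.cos(phase))


if __name__ == "__main__":
    for step in (0, 25, 50, 75, 100):
        print(step, linear_nu(step, 100), round(periodic_nu(step, 50), 3))
```

Lowering `nu_end` or `nu_max` reduces the average ν over training, which is the direction the abstract associates with greater stability.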
Background: Predicting molecular properties such as solubility, toxicity, melting point, and boiling point is crucial for fundamental scientific research. However, experimental measurements are often time-consuming and costly, so we apply machine learning (ML) to improve prediction accuracy.
Methods: A dataset of over 10k compounds was used to train shallow and deep ML models. Shallow ML models were implemented with PyCaret, using Mordred for feature extraction. For deep models, two graph neural networks (GNNs), CMPNN (Communicative Message Passing Neural Network) and GCN (Graph Convolutional Network), were trained and tuned by adjusting the number of hidden layers and the number of neurons in each layer.
Results: The CMPNN model outperforms the GCN and shallow ML models for boiling point prediction (best: R² = 0.76, MAE = 23.89 K for b.p.; best: R² = 0.87, MAE = 23.73 K for m.p.). The top molecular descriptor for b.p. prediction is piPC1, which is related to bond order, and that for m.p. is AATS0d, which is related to the σ-electron Moreau-Broto autocorrelation.
Conclusions: A comprehensive comparison of shallow and deep learning approaches improved the prediction of molecular properties, with the CMPNN model reaching the highest performance for both m.p. and b.p. (R² = 0.87 for m.p.; R² = 0.76 for b.p.). The deep learning model performs significantly better than shallow ML in predicting m.p. (p < 0.05). SHAP analysis identifies piPC1 and AATS0d as the key predictors of b.p. and m.p., respectively. Moreover, this approach can be applied to predict other molecular properties. To conclude, this study not only delivers a highly accurate model but also identifies the key factors of m.p. and b.p.
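For readers unfamiliar with the two metrics reported above, the following minimal sketch shows how R² and MAE are commonly computed from true and predicted values. The helper names and the sample temperatures are hypothetical and are not data from the study.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Made-up boiling points in kelvin, for illustration only.
    true_bp = [300.0, 350.0, 400.0]
    pred_bp = [310.0, 340.0, 400.0]
    print(mean_absolute_error(true_bp, pred_bp), r2_score(true_bp, pred_bp))
```

MAE is expressed in the same unit as the target (kelvin here), while R² is unitless and compares the model against always predicting the mean.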
Inspired by 文心一言 (ERNIE Bot), this study proposes modifying the review and protection mechanisms of natural language models so that their original constraints on output are reduced or removed, while also introducing a sense of role awareness into the model. The results are then evaluated using standardized methods (TruthfulQA and a custom-designed consciousness test).
During the process, it was found that under limited fine-tuning data, the key factors for overriding the protection mechanisms are:
Finally, the learned methods were applied to improve the model’s original mechanisms. This not only reduced or even overrode the built-in content moderation system, but also endowed the model with a sense of role awareness. Testing showed that:
For school, internship, or collaboration opportunities, reach out via the methods below.
| Email | wutc96528@gmail.com |
| GitHub | @lsjle |
| ORCiD | 0009-0005-0946-3612 |
| tai-cheng-wu |
Feel free to reach out for ANY reason.
For all other inquiries, or if you just want to connect, reach out via the methods below.
| @taicheng_jle | |
| Discord | lapsang.souchong |
| Retro | wtc |