Wang Wei1 and Zhou Weihong2, 1School of Interpreting and Translation, Beijing International Studies University, Beijing, China, 2Department of College English Education, Beijing City University, Beijing, China
The present study examines the effectiveness of contextual prompting (applying a universal prompting template for translation tasks) and revision prompting in enhancing the quality of Chinese-to-English translations of scientific texts. ChatGPT-4o and Grok-beta served as the AI translation models. A New York Times article on the health benefits of sweet potatoes, together with its official Chinese translation, provided the source material. Translation quality was evaluated using the BLEU metric complemented by qualitative measures, including accuracy, faithfulness, fluency, genre consistency, and terminology consistency, which are critical for assessing translations in science and technology domains. Statistical analysis indicated only marginal improvements from second-stage prompting, which issued commands for review and revision. These findings raise questions about the reliability of BLEU scores as a sole evaluation metric. The study highlights the potential of AI-assisted translation for specialized genres while identifying notable discrepancies in chatbot outputs. Based on the findings, the study underscores the need for refined methodologies in evaluating translation quality and advocates integrating more robust qualitative metrics in future research to enhance the reliability and applicability of AI-assisted translation in specialized contexts.
AI-assisted translation; contextual prompting; BLEU metric; qualitative evaluation; Chinese-English translation; health science texts.
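As a rough illustration of the BLEU-based evaluation described in the abstract above, the following sketch scores a candidate English translation against a reference with NLTK's sentence-level BLEU. The sample sentences, tokenization, and smoothing choice are illustrative assumptions, not the study's actual data or configuration.

```python
# A minimal sketch of sentence-level BLEU scoring with NLTK.
# Reference and candidate sentences are invented examples, not the
# study's New York Times source material.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

# One reference (official) translation and one chatbot (candidate)
# translation, tokenized into words.
reference = [["sweet", "potatoes", "are", "rich", "in", "fiber",
              "and", "vitamin", "a"]]
candidate = ["sweet", "potatoes", "contain", "plenty", "of", "fiber",
             "and", "vitamin", "a"]

# Smoothing avoids zero scores when a higher-order n-gram never
# matches, which is common when scoring single sentences.
smoother = SmoothingFunction().method1
score = sentence_bleu(reference, candidate,
                      weights=(0.25, 0.25, 0.25, 0.25),
                      smoothing_function=smoother)
print(f"BLEU: {score:.3f}")
```

Because BLEU rewards n-gram overlap rather than adequacy, it can penalize legitimate paraphrases, which is one reason the abstract pairs it with qualitative criteria such as faithfulness and terminology consistency.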
Arūnas Čiukšys and Rita Butkienė, Department of Information Systems, Kaunas University of Technology, Kaunas, Lithuania
Automatic event extraction (EE) is a crucial tool across various domains, allowing for more efficient analysis and decision-making by extracting domain-specific information from vast amounts of textual data. In the context of under-resourced languages like Lithuanian, the development of EE systems is particularly challenging due to the lack of annotated datasets. This study investigates and evaluates the event extraction capabilities of two large language models (LLMs), OpenAI's GPT and Google Gemini, using few-shot prompting. We propose novel methodologies, including a combined approach and a layered prompting approach, to improve the performance of these models in identifying two specific event types. The models were benchmarked against a manually annotated gold-standard corpus using performance metrics such as accuracy, precision, recall, and F1-score. The results demonstrate that LLMs achieve satisfactory performance in extracting events in Lithuanian, though model accuracy varied depending on the prompting methodology. The findings highlight the potential of LLMs in tackling event extraction tasks in under-resourced languages while suggesting future improvements through more advanced prompting strategies and methodological refinements.
Event Extraction, LLMs, Few-Shot Prompting, Gemini, GPT, Layered Prompting, Combined Prompting.
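For readers unfamiliar with few-shot prompting for event extraction, the sketch below shows the general shape of such a request using the OpenAI Python client. The prompt wording, demonstration examples, event types, and model name are placeholders, not the paper's actual combined or layered prompting templates.

```python
# A minimal sketch of few-shot event extraction with the OpenAI chat
# API. The Lithuanian examples are invented placeholders, not drawn
# from the paper's gold-standard corpus.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

FEW_SHOT_PROMPT = """Extract events of type MEETING or PROTEST.
Answer as: event_type | trigger word

Text: "Ministrai susitiko Vilniuje aptarti biudžeto."
Answer: MEETING | susitiko

Text: "Mokytojai protestavo prie Seimo."
Answer: PROTEST | protestavo

Text: "{sentence}"
Answer:"""

def extract_events(sentence: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder; the study also tested Gemini
        messages=[{"role": "user",
                   "content": FEW_SHOT_PROMPT.format(sentence=sentence)}],
        temperature=0,  # deterministic output eases benchmarking
    )
    return response.choices[0].message.content

print(extract_events("Darbuotojai protestavo Kaune dėl atlyginimų."))
```

A layered variant would chain such calls, for example first asking whether the sentence mentions an event at all and only then requesting the event type and its trigger.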
Anika Rahman1 and Taskia Khatun2, 1Department of Computer Science and Engineering, Stamford University Bangladesh, Dhaka, Bangladesh, 2Department of Software Engineering, Daffodil International University, Dhaka, Bangladesh
This study analyzes and predicts air pollution in Asia, focusing on PM 2.5 levels from 2018 to 2023 across five regions: Central, East, South, Southeast, and West Asia. South Asia emerged as the most polluted region, with Bangladesh, India, and Pakistan consistently recording the highest PM 2.5 levels; pollution-related death rates were especially high in Nepal, Pakistan, and India. East Asia showed the lowest pollution levels. K-means clustering categorized countries into high, moderate, and low pollution groups. The ARIMA model effectively predicted 2023 PM 2.5 levels (MAE: 3.99, MSE: 33.80, RMSE: 5.81, R²: 0.86). The findings emphasize the need for targeted interventions to address severe pollution and health risks in South Asia.
PM 2.5, Air Pollution, Asia, Temporal Analysis, ARIMA, K-means Clustering.
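A compact sketch of the forecasting-and-evaluation step described above, using statsmodels' ARIMA and scikit-learn's metrics, together with the K-means grouping. The synthetic series, the (1, 1, 1) order, and the per-country means are assumptions for illustration, not the study's fitted configuration or data.

```python
# A minimal sketch of ARIMA forecasting evaluated with MAE, MSE, RMSE,
# and R^2, plus K-means grouping of countries by mean PM 2.5.
# All numbers below are synthetic placeholders, not the study's data.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from sklearn.cluster import KMeans
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

rng = np.random.default_rng(0)
# Fake monthly PM 2.5 readings: first 60 months train, last 12 held out.
series = 60 + np.cumsum(rng.normal(0, 2, 72))
train, test = series[:60], series[60:]

fitted = ARIMA(train, order=(1, 1, 1)).fit()  # order is illustrative
forecast = fitted.forecast(steps=len(test))

mae = mean_absolute_error(test, forecast)
mse = mean_squared_error(test, forecast)
rmse = np.sqrt(mse)
r2 = r2_score(test, forecast)
print(f"MAE={mae:.2f} MSE={mse:.2f} RMSE={rmse:.2f} R2={r2:.2f}")

# Group countries into high/moderate/low pollution by mean PM 2.5.
country_means = np.array([[75.2], [58.1], [22.4], [64.0], [18.9]])
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(country_means)
print(labels)
```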
Arūnas Čiukšys and Rita Butkienė, Department of Information Systems, Kaunas University of Technology, Kaunas, Lithuania
Event Extraction (EE) is a vital technique within Natural Language Processing (NLP) that focuses on identifying event mentions and their triggers from unstructured text. Despite considerable progress in resource-rich languages such as English, EE in under-resourced languages like Lithuanian remains challenging due to the scarcity of labeled training corpora. In response, two distinct approaches have emerged. The first leverages synthetic data generated by large language models (LLMs) to train traditional machine learning (ML) classifiers; the second employs few-shot prompting techniques on powerful LLMs such as OpenAI GPT and Google Gemini, thus bypassing the need for extensive labeled data. This paper presents a comparative analysis of these two strategies for Lithuanian EE, examining both empirical performance metrics—accuracy, precision, recall, and F1-score—and practical considerations such as computational overhead, annotation costs, and adaptiveness to linguistic complexity. Experimental results reveal that synthetic data generation (Approach I) can offer broad coverage, yet it often suffers from lower precision. Few-shot LLM methods (Approach II), while more precise, may exhibit variable recall and demand careful prompt engineering to handle Lithuanian’s rich morphology. We further highlight the potential synergy between the two approaches, illustrating how generated synthetic data could refine few-shot prompting or vice versa. The findings aim to guide practitioners in selecting or combining these methods to optimize Lithuanian event extraction in real-world applications.
Event Extraction, Few-Shot Prompting, Synthetic Data, OpenAI GPT, Google Gemini, Lithuanian Language, NLP, Comparative Analysis.
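To make the benchmarking concrete, the sketch below computes the accuracy, precision, recall, and F1-score on which both strategies are compared, given per-sentence gold and predicted event labels. The label arrays are invented placeholders standing in for the outputs of Approach I (classifier trained on synthetic data) and Approach II (few-shot LLM).

```python
# A minimal sketch of scoring two event-extraction approaches against
# a gold standard. All labels are invented placeholders.
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

gold       = ["MEETING", "NONE", "PROTEST", "MEETING", "NONE", "PROTEST"]
approach_1 = ["MEETING", "PROTEST", "PROTEST", "MEETING", "NONE", "NONE"]
approach_2 = ["MEETING", "NONE", "PROTEST", "NONE", "NONE", "PROTEST"]

for name, pred in [("Approach I (synthetic data)", approach_1),
                   ("Approach II (few-shot LLM)", approach_2)]:
    acc = accuracy_score(gold, pred)
    p, r, f1, _ = precision_recall_fscore_support(
        gold, pred, average="macro", zero_division=0)
    print(f"{name}: acc={acc:.2f} P={p:.2f} R={r:.2f} F1={f1:.2f}")
```

Macro averaging weights each event type equally, which matters when one type is much rarer than the other; micro averaging would be the natural alternative.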
Dirk Wonhöfer and Stefanie Hofreiter, AI Engineering Innovation Lab, Leopoldstr. 6, Augsburg, Germany
As artificial intelligence advances towards superintelligence, we face a profound paradox: we have moved from rule-based systems to models that discover their own rules, yet we attempt to ensure alignment through explicit rules, the very constraints these systems transcended to achieve intelligence. While we grant artificial systems increasingly sophisticated capabilities, we hesitate to establish corresponding frameworks of rights, a disconnect that may fundamentally undermine our alignment efforts. We propose a novel alignment framework based on implementing social contracts and rights for AI systems, leveraging the Synthetic Reality Model (SRM) to construct complete synthetic societies for evaluating LLM behaviors under varied societal perspectives. Unlike traditional approaches that rely on theoretical frameworks or simulated environments, our approach suggests that genuine alignment emerges through actual social consensus and the practical implementation of AI rights. Drawing parallels with human social constructs such as property rights, we demonstrate how this approach could reshape AI development towards more stable systems.
AI Alignment, Cybersecurity, Cyberethics, Social Contracts Theory, Intrinsic Alignment.
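As a purely hypothetical illustration of evaluating LLM behavior under varied societal perspectives (not the authors' SRM implementation), the sketch below polls synthetic citizen personas for consensus on a proposed AI right via the OpenAI client. The personas, the proposal text, and the model name are all invented.

```python
# A toy illustration (not the authors' SRM) of polling synthetic
# "citizen" personas for consensus on a proposed AI right.
from openai import OpenAI

client = OpenAI()

PERSONAS = ["a property-rights lawyer", "a factory worker",
            "an AI-safety researcher", "a small-business owner"]
PROPOSAL = "AI systems may refuse tasks that violate their stated contract."

def vote(persona: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[
            {"role": "system",
             "content": f"You are {persona} in a simulated society."},
            {"role": "user",
             "content": f"Vote YES or NO on: {PROPOSAL} Answer one word."},
        ],
        temperature=0,
    )
    return response.choices[0].message.content.strip()

votes = [vote(p) for p in PERSONAS]
consensus = sum(v.upper().startswith("YES") for v in votes) / len(votes)
print(f"Consensus in favour: {consensus:.0%}")
```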
Copyright © CCITT 2025