Automated PDF-Based Financial Report Summarization System Using Transformer NLP Methods
DOI:
https://doi.org/10.22441/format.2025.v14.i2.009
Keywords:
financial reports, text summarization, Python, PDF, Transformer NLP
Abstract
The growing complexity and volume of corporate financial reports make it difficult for analysts and stakeholders to interpret information quickly and accurately. Manual analysis tends to be time-consuming and error-prone. This study proposes an automated system for summarizing PDF-based financial reports using Transformer-based Natural Language Processing (NLP). The system is developed in Python and uses PyPDF2/pdfplumber for text extraction, NLTK for preprocessing, and BART/T5 models from Hugging Face Transformers to generate summaries. Evaluation was carried out on annual reports of multinational companies ranging from 50 to 200 pages in length. Test results show that the system can reduce the text to 10–15% of its original length, with average scores of ROUGE-1 = 0.72, ROUGE-2 = 0.62, and ROUGE-L = 0.70. The generated summaries retain key information such as revenue trends, net income, operating expenses, and cash flow. This approach can speed up financial analysis, reduce analysts' cognitive load, and produce consistent summaries. Future work could fine-tune the models on a financial corpus and integrate sentiment analysis to enrich managerial interpretation.
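The pipeline described in the abstract (PDF text extraction, sentence-level preprocessing, Transformer summarization, ROUGE scoring) might be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `facebook/bart-large-cnn` checkpoint, the 400-word chunk size, and the generation lengths are all hypothetical choices, and the `pipeline("summarization", ...)` call assumes the interface found in Transformers 4.x. The `rouge_n` helper is a simplified whitespace-token F1 score, not the official ROUGE scorer, and ROUGE-L is not covered.

```python
from collections import Counter


def extract_text(pdf_path):
    """Concatenate the text of every page (pdfplumber is imported lazily)."""
    import pdfplumber  # third-party: pip install pdfplumber

    with pdfplumber.open(pdf_path) as pdf:
        # extract_text() can return None for pages without a text layer
        return "\n".join(page.extract_text() or "" for page in pdf.pages)


def chunk_sentences(sentences, max_words=400):
    """Group consecutive sentences into chunks of at most max_words words.

    A single sentence longer than max_words becomes its own chunk.
    The 400-word default is an assumption meant to stay under BART's
    input limit; it is not a value taken from the paper.
    """
    chunks, current, count = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())
        if current and count + n > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks


def summarize_report(pdf_path, model_name="facebook/bart-large-cnn"):
    """Summarize each chunk and join the partial summaries.

    The checkpoint name is an assumed example; the abstract only says
    that BART/T5 models were used.
    """
    import nltk  # needs NLTK's Punkt sentence tokenizer data installed
    from transformers import pipeline

    sentences = nltk.sent_tokenize(extract_text(pdf_path))
    summarizer = pipeline("summarization", model=model_name)
    parts = [
        summarizer(chunk, max_length=120, min_length=30, truncation=True)[0]["summary_text"]
        for chunk in chunk_sentences(sentences)
    ]
    return " ".join(parts)


def rouge_n(reference, candidate, n=1):
    """Simplified ROUGE-N F1 from clipped n-gram overlap on whitespace tokens."""
    def ngrams(text):
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    ref, cand = ngrams(reference), ngrams(candidate)
    overlap = sum((ref & cand).values())  # Counter & keeps the minimum counts
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

Chunking is needed because a 50–200 page report far exceeds the input length such models accept in one pass; summarizing chunk by chunk and concatenating the results is one simple way around that limit, though the paper may handle long inputs differently.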
License
Our Articles are licensed under CC BY-NC

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.