SynKGen: A Kernel PCA-based oversampling method to improve credit card fraud detection

DOI:

https://doi.org/10.51252/rcsi.v5i2.952

Keywords:

ADASYN, class imbalance, ensemble learning, financial security, Kernel PCA, machine learning, SMOTE

Abstract

Credit card fraud detection is a growing challenge in the financial sector because of class imbalance: fraudulent transactions are a very small fraction of legitimate ones. This study presents SynKGen, a data augmentation method that uses Kernel PCA with Gaussian perturbations to generate synthetic minority-class samples, and compares it with ADASYN and SMOTE. By introducing variance analysis with controlled perturbations in the minority class, the proposed approach mitigates the overfitting risk associated with traditional interpolation-based techniques. Four classifiers (XGBoost, RandomForest, AdaBoost, and VotingClassifier) were evaluated on the original dataset and on variants with data augmentation. RandomForest achieved the best performance when using data generated with SynKGen (accuracy: 0.9949, precision: 0.9899), outperforming the results obtained with ADASYN and SMOTE. The experimental results show that SynKGen improves the effectiveness of credit card fraud detection. These findings highlight the importance of data augmentation strategies for optimizing classifier performance in financial settings with imbalanced data.
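The general idea of Kernel PCA with Gaussian perturbations can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify where the perturbation is applied, which kernel is used, or how the noise is scaled. The sketch below assumes an RBF kernel, noise added in the Kernel PCA latent space, and scikit-learn's approximate pre-image map to return to feature space. The function name `kpca_oversample` and the parameters `n_components`, `gamma`, and `noise_scale` are hypothetical.

```python
import numpy as np
from sklearn.decomposition import KernelPCA


def kpca_oversample(X_min, n_new, n_components=2, gamma=None,
                    noise_scale=0.1, seed=0):
    """Generate n_new synthetic rows from minority-class rows X_min.

    Hypothetical helper (an assumption, not the published SynKGen code):
    project the minority class into a Kernel PCA latent space, perturb
    randomly chosen latent points with Gaussian noise, and map the
    perturbed points back to the original feature space.
    """
    rng = np.random.default_rng(seed)
    kpca = KernelPCA(
        n_components=n_components,
        kernel="rbf",                # assumed kernel choice
        gamma=gamma,                 # None lets scikit-learn use 1 / n_features
        fit_inverse_transform=True,  # learn an approximate pre-image map
    )
    Z = kpca.fit_transform(X_min)    # latent coordinates of the real samples

    # Pick latent points with replacement to serve as seeds.
    idx = rng.integers(0, Z.shape[0], size=n_new)

    # Gaussian noise scaled by each latent component's standard deviation,
    # so the perturbation stays proportional to the observed spread.
    noise = rng.normal(0.0, noise_scale * Z.std(axis=0),
                       size=(n_new, Z.shape[1]))

    # Approximate pre-image of the perturbed latent points.
    return kpca.inverse_transform(Z[idx] + noise)


# Toy stand-in for minority-class (fraud) rows: 30 samples, 5 features.
X_fraud = np.random.default_rng(42).normal(size=(30, 5))
X_syn = kpca_oversample(X_fraud, n_new=100)
print(X_syn.shape)  # (100, 5)
```

Unlike interpolation between neighboring points, which is what SMOTE and ADASYN do, the synthetic rows here are not constrained to lie on line segments between existing minority samples. A larger `noise_scale` spreads them further from their seeds.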

Published

2025-07-20

How to cite

Becerra-Suarez, F. L., Jiménez-Fernández, L. J., Ticona-Tapia, E. D., Cárdenas-Gonzáles, J. R., & Bustamante-Quintana, P. H. (2025). SynKGen: Un método de sobremuestreo basado en kernel PCA para mejorar la detección de fraudes en tarjetas de crédito [SynKGen: A Kernel PCA-based oversampling method to improve credit card fraud detection]. Revista Científica De Sistemas E Informática, 5(2), e952. https://doi.org/10.51252/rcsi.v5i2.952