Metaheuristic Optimization Algorithms in Artificial Intelligence: A Comprehensive Systematic Review of Neural Architecture Search, Hyperparameter Optimization, and Intelligent Feature Engineering

https://doi.org/10.48313/maa.v2i3.49

Abstract

The intersection of metaheuristic optimization algorithms and Artificial Intelligence (AI) has become an active research frontier, yielding significant advances in the automated design and tuning of Machine Learning (ML) models. This paper presents a systematic review, conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, of 347 peer-reviewed studies published between 2015 and 2025 and retrieved from five major scholarly databases: Scopus, Web of Science (WoS), IEEE Xplore, ACM Digital Library, and arXiv. The review investigates three domains of AI optimization in which metaheuristic algorithms have proven effective: 1) Neural Architecture Search (NAS), covering convolutional, recurrent, and transformer architecture design; 2) Hyperparameter Optimization (HPO), covering learning rate tuning, batch size selection, regularization parameter calibration, and optimizer configuration; and 3) intelligent feature engineering, including wrapper-based feature selection, feature construction, and dimensionality reduction. Our analysis reveals that, across the reviewed studies, evolutionary algorithms (Genetic Algorithms (GAs), Differential Evolution (DE)) and swarm intelligence methods (Particle Swarm Optimization (PSO), Grey Wolf Optimizer (GWO), Whale Optimization Algorithm (WOA)) consistently outperform traditional grid search and random search, achieving average accuracy improvements of 2.3%–5.8% while reducing computational cost by 40%–75%. Furthermore, hybrid metaheuristic–AI approaches deliver performance gains exceeding those of standalone methods.
The review also provides a bibliometric analysis, identifies key research trends, highlights methodological challenges (including computational overhead, scalability limitations, and reproducibility concerns), and proposes eight future research directions spanning federated optimization, quantum-inspired metaheuristics, and Large Language Model (LLM) architecture search. This work serves as a reference for researchers and practitioners seeking to apply metaheuristics to automated AI model optimization.

Keywords:

Metaheuristic algorithms, Artificial intelligence, Neural architecture search, Hyperparameter optimization, Feature selection

References

  1. [1] LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444. https://doi.org/10.1038/nature14539

  2. [2] Jordan, M. I., & Mitchell, T. M. (2015). Machine learning: Trends, perspectives, and prospects. Science, 349(6245), 255–260. https://doi.org/10.1126/science.aaa8415

  3. [3] Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT Press Cambridge. https://mitpress.mit.edu/9780262035613/deep-learning/

  4. [4] Talbi, E. G. (2009). Metaheuristics: From design to implementation. Wiley. https://www.wiley.com/en-us/Metaheuristics%3A+From+Design+to+Implementation+-p-9780470278581

  5. [5] Yang, X. S. (2020). Nature-inspired optimization algorithms. Elsevier. https://shop.elsevier.com/books/nature-inspired-optimization-algorithms/yang/978-0-12-821986-7

  6. [6] Boussaïd, I., Lepagnot, J., & Siarry, P. (2013). A survey on optimization metaheuristics. Information sciences, 237, 82–117. https://doi.org/10.1016/j.ins.2013.02.041

  7. [7] Karimi-Mamaghan, M., Mohammadi, M., Meyer, P., Karimi-Mamaghan, A. M., & Talbi, E. G. (2022). Machine learning at the service of meta-heuristics for solving combinatorial optimization problems: A state-of-the-art. European journal of operational research, 296(2), 393–422. https://doi.org/10.1016/j.ejor.2021.04.032

  8. [8] Handoko, S. D., Nguyen, D. T., Yuan, Z., & Lau, H. (2014). Reinforcement learning for adaptive operator selection in memetic search applied to quadratic assignment problem. GECCO comp ’14: Proceedings of the companion publication of the 2014 annual conference on genetic and evolutionary computation (pp. 193–194). ACM Digital Library. https://doi.org/10.1145/2598394.2598451

  9. [9] Dokeroglu, T., Canturk, D., & Kucukyilmaz, T. (2024). A survey on pioneering metaheuristic algorithms between 2019 and 2024. https://doi.org/10.48550/arXiv.2501.14769

  10. [10] Zoph, B., & Le, Q. V. (2016). Neural architecture search with reinforcement learning. https://doi.org/10.48550/arXiv.1611.01578

  11. [11] Elsken, T., Metzen, J. H., & Hutter, F. (2019). Neural architecture search: A survey. Journal of machine learning research, 20(55), 1–21. http://jmlr.org/papers/v20/18-598.html

  12. [12] Feurer, M., & Hutter, F. (2019). Hyperparameter optimization. In Automated machine learning (pp. 3–33). Springer, Cham. https://doi.org/10.1007/978-3-030-05318-5_1

  13. [13] Yu, T., & Zhu, H. (2020). Hyper-parameter optimization: A review of algorithms and applications. https://doi.org/10.48550/arXiv.2003.05689

  14. [14] Xue, B., Zhang, M., Browne, W. N., & Yao, X. (2016). A survey on evolutionary computation approaches to feature selection. IEEE transactions on evolutionary computation, 20(4), 606–626. https://doi.org/10.1109/TEVC.2015.2504420

  15. [15] Li, J., Cheng, K., Wang, S., Morstatter, F., Trevino, R. P., Tang, J., & Liu, H. (2017). Feature selection: A data perspective. ACM computing surveys (CSUR), 50(6), 1–45. https://doi.org/10.1145/3136625

  16. [16] Glorot, X., & Bengio, Y. (2010). Understanding the difficulty of training deep feedforward neural networks. Proceedings of the thirteenth international conference on artificial intelligence and statistics (pp. 249-256). JMLR Workshop and Conference Proceedings. https://proceedings.mlr.press/v9/glorot10a/glorot10a.pdf

  17. [17] Loshchilov, I., & Hutter, F. (2017). SGDR: Stochastic gradient descent with warm restarts. 5th international conference on learning representations (ICLR 2017) (PP. 1-11). OpenReview. https://researchr.org/publication/LoshchilovH17

  18. [18] Li, H., Xu, Z., Taylor, G., Studer, C., & Goldstein, T. (2018). Visualizing the loss landscape of neural nets. Advances in neural information processing systems 31 (NeurIPS 2018) (pp. 6389–6399). Curran Associates, Inc. https://proceedings.neurips.cc/paper/2018/hash/a41b3bb3e6b050b6c9067c67f663b915-Abstract.html

  19. [19] Zhou, Z. H. (2012). Ensemble methods: Foundations and algorithms. CRC Press. https://doi.org/10.1201/b12207

  20. [20] Holand, J. H. (1975). Adaptation in natural and artificial systems. The MIT Press. https://mitpress.mit.edu/9780262082136/adaptation-in-natural-and-artificial-systems/

  21. [21] Storn, R., & Price, K. (1997). Differential evolution – A simple and efficient heuristic for global optimization over continuous spaces. Journal of global optimization, 11(4), 341–359. https://doi.org/10.1023/A:1008202821328

  22. [22] Beyer, H. G., & Schwefel, H. P. (2002). Evolution strategies – A comprehensive introduction. Natural computing, 1(1), 3–52. https://doi.org/10.1023/A:1015059928466

  23. [23] Kennedy, J., & Eberhart, R. (1995). Particle swarm optimization. Proceedings of ICNN’95 - international conference on neural networks (pp. 1942–1948). IEEE. https://doi.org/10.1109/ICNN.1995.488968

  24. [24] Dorigo, M., Maniezzo, V., & Colorni, A. (1996). Ant system: Optimization by a colony of cooperating agents. IEEE transactions on systems, man, and cybernetics, part b (cybernetics), 26(1), 29–41. https://doi.org/10.1109/3477.484436

  25. [25] Karaboga, D. (2005). An idea based on honey bee swarm for numerical optimization. https://abc.erciyes.edu.tr/pub/tr06_2005.pdf

  26. [26] Mirjalili, S., Mirjalili, S. M., & Lewis, A. (2014). Grey wolf optimizer. Advances in engineering software, 69, 46–61. https://doi.org/10.1016/j.advengsoft.2013.12.007

  27. [27] Mirjalili, S., & Lewis, A. (2016). The whale optimization algorithm. Advances in engineering software, 95, 51–67. https://doi.org/10.1016/j.advengsoft.2016.01.008

  28. [28] Heidari, A. A., Mirjalili, S., Faris, H., Aljarah, I., Mafarja, M., & Chen, H. (2019). Harris hawks optimization: Algorithm and applications. Future generation computer systems, 97, 849–872. https://doi.org/10.1016/j.future.2019.02.028

  29. [29] Mirjalili, S., Gandomi, A. H., Mirjalili, S. Z., Saremi, S., Faris, H., & Mirjalili, S. M. (2017). Salp swarm algorithm: A bio-inspired optimizer for engineering design problems. Advances in engineering software, 114, 163–191. https://doi.org/10.1016/j.advengsoft.2017.07.002

  30. [30] Kirkpatrick, S., Gelatt, C. D., & Vecchi, M. P. (1983). Optimization by simulated annealing. Science, 220(4598), 671–680. https://doi.org/10.1126/science.220.4598.671

  31. [31] Mirjalili, S. (2016). SCA: A sine cosine algorithm for solving optimization problems. Knowledge-based systems, 96, 120–133. https://doi.org/10.1016/j.knosys.2015.12.022

  32. [32] Mirjalili, S. (2015). Moth-flame optimization algorithm: A novel nature-inspired heuristic paradigm. Knowledge-based systems, 89, 228–249. https://doi.org/10.1016/j.knosys.2015.07.006

  33. [33] Osaba, E., Del Ser, J., Sadollah, A., Bilbao, M. N., & Camacho, D. (2018). A discrete water cycle algorithm for solving the symmetric and asymmetric traveling salesman problem. Applied soft computing, 71, 277–290. https://doi.org/10.1016/j.asoc.2018.06.047

  34. [34] Bergstra, J., & Bengio, Y. (2012). Random search for hyper-parameter optimization. Journal of machine learning research, 13(2), 281–305. http://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a.pdf

  35. [35] Snoek, J., Larochelle, H., & Adams, R. (2012). Practical Bayesian optimization of machine learning algorithms. Advances in neural information processing systems (Vol. 25, pp. 2951–2959). Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2012/file/05311655a15b75fab86956663e1819cd-Paper.pdf

  36. [36] Hutter, F., Hoos, H. H., & Leyton-Brown, K. (2011). Sequential model-based optimization for general algorithm configuration. In Learning and intelligent optimization (pp. 507–523). Berlin, Heidelberg: Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-642-25566-3_40

  37. [37] Miikkulainen, R., & Forrest, S. (2021). A biological perspective on evolutionary computation. Nature machine intelligence, 3(1), 9–15. https://doi.org/10.1038/s42256-020-00278-8

  38. [38] Fister Jr, I., Yang, X.-S., Fister, I., Brest, J., & Fister, D. (2013). A brief review of nature-inspired algorithms for optimization. Elektrotehniski vestnik, 80(3), 116–122. https://www.researchgate.net/publication/249645112

  39. [39] Abdel-Basset, M., Abdel-Fatah, L., & Sangaiah, A. K. (2018). Metaheuristic algorithms: A comprehensive review. In Computational intelligence for multimedia big data on the cloud with engineering applications (pp. 185–231). Academic Press. https://doi.org/10.1016/B978-0-12-813314-9.00010-4

  40. [40] Goldberg, D. E. (1989). Genetic algorithms in search, optimization & machine learning. Addison-Wesley. https://www.amazon.fr/Algorithms-Optimization-Learning-Goldberg-published/dp/B00E31KI3G

  41. [41] Xie, L., & Yuille, A. (2017). Genetic CNN. Proceedings of the IEEE international conference on computer vision (ICCV 2017) (pp. 1379–1388). IEEE. https://doi.org/10.1109/ICCV.2017.154

  42. [42] Das, S., & Suganthan, P. N. (2011). Differential evolution: A survey of the state-of-the-art. IEEE transactions on evolutionary computation, 15(1), 4–31. https://doi.org/10.1109/TEVC.2010.2059031

  43. [43] Mafarja, M., Aljarah, I., Faris, H., Hammouri, A. I., Al-Zoubi, A. M., & Mirjalili, S. (2019). Binary grasshopper optimisation algorithm approaches for feature selection problems. Expert systems with applications, 117, 267–286. https://doi.org/10.1016/j.eswa.2018.09.015

  44. [44] Bischl, B., Richter, J., Becker, M., Binder, M., & Pielok, T. (2023). Hyperparameter optimization: Foundations, algorithms, best practices, and open challenges. WIREs data mining and knowledge discovery, 13(2), 1–43. https://doi.org/10.1002/widm.1484

  45. [45] White, C., Neiswanger, W., & Savani, Y. (2021). BANANAS: Bayesian optimization with neural architectures for neural architecture search. Proceedings of the AAAI conference on artificial intelligence, (Vol. 35, No. 12, PP. 10293-10301). https://doi.org/10.1609/aaai.v35i12.17233

  46. [46] Liu, H., Simonyan, K., & Yang, Y. (2018). Darts: Differentiable architecture search. https://doi.org/10.48550/arXiv.1806.09055

  47. [47] Pham, H., Guan, M., Zoph, B., Le, Q., & Dean, J. (2018). Efficient neural architecture search via parameters sharing. Proceedings of the 35th international conference on machine learning (pp. 4095–4104). Proceedings of Machine Learning Research (PMLR). https://proceedings.mlr.press/v80/pham18a.html

  48. [48] Zoph, B., Vasudevan, V., Shlens, J., & Le, Q. V. (2018). Learning transferable architectures for scalable image recognition. 2018 IEEE/CVF conference on computer vision and pattern recognition (CVPR) (pp. 8697–8710). IEEE. https://doi.org/10.1109/CVPR.2018.00907

  49. [49] Guyon, I., & Elisseeff, A. (2003). An introduction to variable and feature selection. Journal of machine learning research, 3, 1157–1182. https://www.jmlr.org/papers/v3/guyon03a.html

  50. [50] Eiben, A. E., & Smith, J. E. (2015). Introduction to evolutionary computing. Springer Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44874-8

  51. [51] Črepinšek, M., Liu, S. H., & Mernik, M. (2013). Exploration and exploitation in evolutionary algorithms: A survey. ACM computing surveys (CSUR), 45(3), 1–33. https://doi.org/10.1145/2480741.2480752

  52. [52] Alba, E., & Dorronsoro, B. (2008). Cellular genetic algorithms. Springer New York, NY. https://doi.org/10.1007/978-0-387-77610-1

  53. [53] Morales-Castañeda, B., Zaldívar, D., Cuevas, E., Fausto, F., & Rodríguez, A. (2020). A better balance in metaheuristic algorithms: Does it exist? Swarm and evolutionary computation, 54, 100671. https://doi.org/10.1016/j.swevo.2020.100671

  54. [54] White, C., Zela, A., Ru, R., Liu, Y., & Hutter, F. (2021). How powerful are performance predictors in neural architecture search? Advances in neural information processing systems (pp. 28454–28469). Curran Associates, Inc. https://proceedings.neurips.cc/paper/2021/hash/ef575e8837d065a1683c022d2077d342-Abstract.html

  55. [55] Karafotias, G., Hoogendoorn, M., & Eiben, A. E. (2015). Parameter control in evolutionary algorithms: Trends and challenges. IEEE transactions on evolutionary computation, 19(2), 167–187. https://doi.org/10.1109/TEVC.2014.2308294

  56. [56] Salmani Pour Avval, S., Eskue, N. D., Groves, R. M., & Yaghoubi, V. (2025). Systematic review on neural architecture search. Artificial intelligence review, 58(3), 73. https://doi.org/10.1007/s10462-024-11058-w

  57. [57] Baker, B., Gupta, O., Raskar, R., & Naik, N. (2017). Accelerating neural architecture search using performance prediction. https://doi.org/10.48550/arXiv.1705.10823

  58. [58] Li, L., & Talwalkar, A. (2020). Random search and reproducibility for neural architecture search. Proceedings of the 35th uncertainty in artificial intelligence conference (pp. 367–377). Proceedings of Machine Learning Research (PMLR). https://proceedings.mlr.press/v115/li20c.html

  59. [59] Liu, Y., Sun, Y., Xue, B., Zhang, M., Yen, G. G., & Tan, K. C. (2023). A survey on evolutionary neural architecture search. IEEE transactions on neural networks and learning systems, 34(2), 550–570. https://doi.org/10.1109/TNNLS.2021.3100554

  60. [60] Real, E., Aggarwal, A., Huang, Y., & Le, Q. V. (2019). Regularized evolution for image classifier architecture search. Proceedings of the AAAI conference on artificial intelligence (pp. 4780–4789). AAAI Press. https://doi.org/10.1609/aaai.v33i01.33014780

  61. [61] Ren, P., Xiao, Y., Chang, X., Huang, P. Y., Li, Z., Chen, X., & Wang, X. (2021). A comprehensive survey of neural architecture search: Challenges and solutions. ACM computing surveys (CSUR), 54(4), 1–34. https://doi.org/10.1145/3447582

  62. [62] Page, M. J., McKenzie, J. E., Bossuyt, P. M., Boutron, I., Hoffmann, T. C., Mulrow, C. D., … ., & Moher, D. (2021). The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. British medical journal, 372. https://doi.org/10.1136/bmj.n71

  63. [63] Real, E., Liang, C., So, D., & Le, Q. (2020). AutoML-zero: Evolving machine learning algorithms from scratch. Proceedings of the 37th international conference on machine learning (pp. 8007–8019). Proceedings of Machine Learning Research (PMLR). https://proceedings.mlr.press/v119/real20a.html

  64. [64] Sun, Y., Xue, B., Zhang, M., & Yen, G. G. (2020). Evolving deep convolutional neural networks for image classification. IEEE transactions on evolutionary computation, 24(2), 394–407. https://doi.org/10.1109/TEVC.2019.2916183

  65. [65] Liu, H., Simonyan, K., Vinyals, O., Fernando, C., & Kavukcuoglu, K. (2017). Hierarchical representations for efficient architecture search. https://doi.org/10.48550/arXiv.1711.00436

  66. [66] Junior, F. E. F., & Yen, G. G. (2019). Particle swarm optimization of deep neural networks architectures for image classification. Swarm and evolutionary computation, 49, 62–74. https://doi.org/10.1016/j.swevo.2019.05.010

  67. [67] Brodzicki, A., Piekarski, M., & Jaworek-Korjakowska, J. (2021). The whale optimization algorithm approach for deep neural networks. Sensors, 21(23), 8003. https://doi.org/10.3390/s21238003

  68. [68] Singh, T., Solanki, A., Sharma, S. K., Jhanjhi, N. Z., & Ghoniem, R. M. (2023). Grey wolf optimization-based CNN-LSTM network for the prediction of energy consumption in smart home environment. IEEE access, 11, 114917-114935. https://doi.org/10.1109/ACCESS.2023.3311751

  69. [69] Lu, Z., Whalen, I., Boddeti, V., Dhebar, Y., Deb, K., Goodman, E., & Banzhaf, W. (2019). NSGA-net: Neural architecture search using multi-objective genetic algorithm. Proceedings of the genetic and evolutionary computation conference (pp. 419–427). New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3321707.3321729

  70. [70] Lu, Z., Whalen, I., Dhebar, Y., Deb, K., Goodman, E. D., Banzhaf, W., & Boddeti, V. N. (2021). Multiobjective evolutionary design of deep convolutional neural networks for image classification. IEEE transactions on evolutionary computation, 25(2), 277–291. https://doi.org/10.1109/TEVC.2020.3024708

  71. [71] Gu, H., Wang, H., & Jin, Y. (2022). Surrogate-assisted differential evolution with adaptive multi-subspace search for large-scale expensive optimization. IEEE transactions on evolutionary computation, 27(6), 1765 - 1779. https://doi.org/10.1109/TEVC.2022.3226837

  72. [72] Ghosh, A., Jana, N. D., & Ghosh, S. (2025). Automated CNN architecture design with enhanced particle swarm optimization. Journal of heuristics, 31(4), 35. https://doi.org/10.1007/s10732-025-09570-5

  73. [73] Faramarzi, A., Heidarinejad, M., Mirjalili, S., & Gandomi, A. H. (2020). Marine predators algorithm: A nature-inspired metaheuristic. Expert systems with applications, 152, 113377. https://doi.org/10.1016/j.eswa.2020.113377

  74. [74] Franceschi, L., Donini, M., Perrone, V., Klein, A., Archambeau, C., Seeger, M., … ., & Frasconi, P. (2025). Hyperparameter optimization in machine learning. Foundations and trends in machine learning, 18(6), 975–1109. https://doi.org/10.1561/2200000088

  75. [75] Lorenzo, P. R., Nalepa, J., Kawulok, M., Ramos, L. S., & Pastor, J. R. (2017). Particle swarm optimization for hyper-parameter selection in deep neural networks. Proceedings of the genetic and evolutionary computation conference (pp. 481–488). New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3071178.3071208

  76. [76] Xue, B., Zhang, M., & Browne, W. N. (2014). Particle swarm optimisation for feature selection in classification: Novel initialisation and updating mechanisms. Applied soft computing, 18, 261–276. https://doi.org/10.1016/j.asoc.2013.09.018

  77. [77] Ibrahim, M. Q., Hussein, N. K., Guinovart, D., & Qaraad, M. (2025). Optimizing convolutional neural networks: A comprehensive review of hyperparameter tuning through metaheuristic algorithms. Archives of computational methods in engineering, 32(8), 5123–5160. https://doi.org/10.1007/s11831-025-10292-x

  78. [78] Emary, E., Zawbaa, H. M., & Hassanien, A. E. (2016). Binary grey wolf optimization approaches for feature selection. Neurocomputing, 172, 371–381. https://doi.org/10.1016/j.neucom.2015.06.083

  79. [79] Albelwi, S., & Mahmood, A. (2017). A framework for designing the architectures of deep convolutional neural networks. Entropy, 19(6), 1–20. https://doi.org/10.3390/e19060242

  80. [80] Chen, K., & Xie, J. (2025). Hybrid adaptive Wolf-Particle swarm optimization algorithm and its application in CNN neural network hyperparameters optimization. Discover computing, 28(1), 319. https://doi.org/10.1007/s10791-025-09878-7

  81. [81] Al-Tashi, Q., Abdulkadir, S. J., Rais, H. M., Mirjalili, S., & Alhussian, H. (2020). Binary optimization using hybrid grey wolf optimization for feature selection. IEEE access, 7(1), 39496-39508. https://doi.org/10.1109/ACCESS.2019.2906757

  82. [82] Ibrahim, R. A., Elaziz, M. A., & Lu, S. (2018). Chaotic opposition-based grey-wolf optimization algorithm based on differential evolution and disruption operator for global optimization. Expert systems with applications, 108, 1–27. https://doi.org/10.1016/j.eswa.2018.04.028

  83. [83] Yang, X. S. (2010). A new metaheuristic bat-inspired algorithm. In Nature inspired cooperative strategies for optimization (NICSO 2010) (pp. 65–74). Berlin, Heidelberg: Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-642-12538-6_6

  84. [84] Coppola, C., Papa, L., Boresta, M., Amerini, I., & Palagi, L. (2024). Tuning parameters of deep neural network training algorithms pays off: A computational study. Transactions in operations research (TOP), 32(3), 579–620. https://doi.org/10.1007/s11750-024-00683-x

  85. [85] Probst, P., Boulesteix, A. L., & Bischl, B. (2019). Tunability: Importance of hyperparameters of machine learning algorithms. Journal of machine learning research, 20(53), 1–32. http://jmlr.org/papers/v20/18-444.html

  86. [86] Golovin, D., Solnik, B., Moitra, S., Kochanski, G., Karro, J., & Sculley, D. (2017). Google vizier: A service for black-box optimization. Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining (KDD ’17) (pp. 1487–1495). New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3097983.3098043

  87. [87] Bischl, B., Casalicchio, G., Feurer, M., Gijsbers, P., Hutter, F., Lang, M., ... & Vanschoren, J. (2017). Openml benchmarking suites. https://doi.org/10.48550/arXiv.1708.03731

  88. [88] Chandrashekar, G., & Sahin, F. (2014). A survey on feature selection methods. Computers & electrical engineering, 40(1), 16–28. https://doi.org/10.1016/j.compeleceng.2013.11.024

  89. [89] Kaur, A., Chhabbra, A., & Shivani. (2024). A comprehensive review of feature selection techniques with metaheuristic algorithms (2019–2024). International conference on information and communication technology for competitive strategies (pp. 401-417). Singapore: Springer Nature Singapore. https://doi.org/10.1007/978-981-96-4142-0_34

  90. [90] Hussain, K., Mohd Salleh, M. N., Cheng, S., & Shi, Y. (2019). Metaheuristic research: A comprehensive survey. Artificial intelligence review, 52(4), 2191–2233. https://doi.org/10.1007/s10462-017-9605-z

  91. [91] Nguyen, B. H., Xue, B., & Zhang, M. (2020). A survey on swarm intelligence approaches to feature selection in data mining. Swarm and evolutionary computation, 54, 100663. https://doi.org/10.1016/j.swevo.2020.100663

  92. [92] Mafarja, M., & Mirjalili, S. (2018). Whale optimization approaches for wrapper feature selection. Applied soft computing, 62, 441–453. https://doi.org/10.1016/j.asoc.2017.11.006

  93. [93] Xue, B., Zhang, M., & Browne, W. N. (2013). Particle swarm optimization for feature selection in classification: A multi-objective approach. IEEE transactions on cybernetics, 43(6), 1656–1671. https://doi.org/10.1109/TSMCB.2012.2227469

  94. [94] Too, J., & Mirjalili, S. (2021). A hyper learning binary dragonfly algorithm for feature selection: A COVID-19 case study. Knowledge-based systems, 212, 106553. https://doi.org/10.1016/j.knosys.2020.106553

  95. [95] Siedlecki, W., & Sklansky, J. (1989). A note on genetic algorithms for large-scale feature selection. Pattern recognition letters, 10(5), 335–347. https://doi.org/10.1016/0167-8655(89)90037-8

  96. [96] Faris, H., Mafarja, M. M., Heidari, A. A., Aljarah, I., Al-Zoubi, A. M., Mirjalili, S., & Fujita, H. (2018). An efficient binary Salp Swarm algorithm with crossover scheme for feature selection problems. Knowledge-based systems, 154, 43–67. https://doi.org/10.1016/j.knosys.2018.05.009

  97. [97] Cui, X., Luo, Q., Zhou, Y., Deng, W., & Yin, S. (2022). Quantum-inspired moth-flame optimizer with enhanced local search strategy for cluster analysis. Frontiers in bioengineering and biotechnology, 10, 908356. https://doi.org/10.3389/fbioe.2022.908356

  98. [98] Nenavath, H., & Jatoth, R. K. (2018). Hybridizing sine Cosine algorithm with differential evolution for global optimization and object tracking. Applied soft computing, 62, 1019–1043. https://doi.org/10.1016/j.asoc.2017.09.039

  99. [99] Abd Elaziz, M., Ewees, A. A., Yousri, D., Abualigah, L., & Al-qaness, M. A. A. (2022). Modified marine predators algorithm for feature selection: Case study metabolomics. Knowledge and information systems, 64(1), 261–287. https://doi.org/10.1007/s10115-021-01641-w

  100. [100] Hancer, E., Xue, B., & Zhang, M. (2018). Differential evolution for filter feature selection based on information theory and feature ranking. Knowledge-based systems, 140, 103–119. https://doi.org/10.1016/j.knosys.2017.10.028

  101. [101] Guyon, I., Weston, J., Barnhill, S., & Vapnik, V. (2002). Gene selection for Cancer classification using support vector machines. Machine learning, 46(1), 389–422. https://doi.org/10.1023/A:1012487302797

  102. [102] Mirjalili, S. (2019). Genetic algorithm. In Evolutionary algorithms and neural networks (pp. 43-55). Springer International Publishing. https://www.springerprofessional.de/en/genetic-algorithm/15882800

  103. [103] Koza, J. R. (1992). Genetic programming on the programming of computers by means of natural selection. MIT Press. https://mitpress.mit.edu/9780262527910/genetic-programming/

  104. [104] Tran, B., Xue, B., & Zhang, M. (2019). Variable-length particle swarm optimization for feature selection on high-dimensional classification. IEEE transactions on evolutionary computation, 23(3), 473–487. https://doi.org/10.1109/TEVC.2018.2869405

  105. [105] Ruder, S. (2016). An overview of gradient descent optimization algorithms. https://doi.org/10.48550/arXiv.1609.04747

  106. [106] Ding, S., Li, H., Su, C., Yu, J., & Jin, F. (2013). Evolutionary artificial neural networks: A review. Artificial intelligence review, 39(3), 251–260. https://doi.org/10.1007/s10462-011-9270-6

  107. [107] Smith, L. N. (2017). Cyclical learning rates for training neural networks. 2017 IEEE winter conference on applications of computer vision (WACV) (pp. 464–472). IEEE. https://doi.org/10.1109/WACV.2017.58

  108. [108] Liang, J., Meyerson, E., Hodjat, B., Fink, D., Mutch, K., & Miikkulainen, R. (2019). Evolutionary neural autoML for deep learning. Proceedings of the genetic and evolutionary computation conference (pp. 401–409). New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3321707.3321721

  109. [109] Such, F. P., Madhavan, V., Conti, E., Lehman, J., Stanley, K. O., & Clune, J. (2017). Deep neuroevolution: Genetic algorithms are a competitive alternative for training deep neural networks for reinforcement learning. https://doi.org/10.48550/arXiv.1712.06567

  110. [110] de Campos Souza, P. V., & Sayyadzadeh, I. (2025). GWO-FNN: Fuzzy neural network optimized via grey wolf optimization. Mathematics, 13(7), 1–48. https://doi.org/10.3390/math13071156

  111. [111] Ingber, L. (1993). Simulated annealing: Practice versus theory. Mathematical and computer modelling, 18(11), 29–57. https://doi.org/10.1016/0895-7177(93)90204-C

  112. [112] Aljarah, I., Faris, H., & Mirjalili, S. (2018). Optimizing connection weights in neural networks using the whale optimization algorithm. Soft computing, 22(1), 1–15. https://doi.org/10.1007/s00500-016-2442-1

  113. [113] Stanley, K. O., Clune, J., Lehman, J., & Miikkulainen, R. (2019). Designing neural networks through neuroevolution. Nature machine intelligence, 1(1), 24–35. https://doi.org/10.1038/s42256-018-0006-z

  114. [114] Kuncheva, L. I. (2014). Combining pattern classifiers: Methods and algorithms. Wiley Online Library. https://doi.org/10.1002/9781118914564

  115. [115] Brown, G. (2011). Ensemble learning. In Encyclopedia of machine learning (pp. 312–320). Springer. https://doi.org/10.1007/978-0-387-30164-8_252

  116. [116] Zhou, Z. H., Wu, J., & Tang, W. (2002). Ensembling neural networks: Many could be better than all. Artificial intelligence, 137(1), 239–263. https://doi.org/10.1016/S0004-3702(02)00190-X

  117. [117] Oliveira, L. S., Sabourin, R., Bortolozzi, F., & Suen, C. Y. (2003). A methodology for feature selection using multiobjective genetic algorithms for handwritten digit string recognition. International journal of pattern recognition and artificial intelligence, 17(06), 903–929. https://doi.org/10.1142/S021800140300271X

  118. [118] LeDell, E., & Poirier, S. (2020). H2o autoML: Scalable automatic machine learning. 7th ICML workshop on automated machine learning (pp. 1-16). International Machine Learning Society (IMLS). https://www.automl.org/wp-content/uploads/2020/07/AutoML_2020_paper_61.pdf

  119. [119] Mirjalili, S. (2015). How effective is the grey wolf optimizer in training multi-layer perceptrons. Applied intelligence, 43(1), 150–161. https://doi.org/10.1007/s10489-014-0645-7

  120. [120] Gharehchopogh, F. S., & Gholizadeh, H. (2019). A comprehensive survey: Whale optimization algorithm and its applications. Swarm and evolutionary computation, 48, 1–24. https://doi.org/10.1016/j.swevo.2019.03.004

  121. [121] Heidari, A. A., & Pahlavani, P. (2017). An efficient modified grey wolf optimizer with Lévy flight for optimization tasks. Applied soft computing, 60, 115–134. https://doi.org/10.1016/j.asoc.2017.06.044

  122. [122] Sutton, R. S., & Barto, A. G. (1999). Reinforcement learning: An introduction. MIT Press. https://mitpress.mit.edu/9780262039246/reinforcement-learning/

  123. [123] Stanley, K. O., & Miikkulainen, R. (2002). Evolving neural networks through augmenting topologies. Evolutionary computation, 10(2), 99–127. https://doi.org/10.1162/106365602320169811

  124. [124] Salimans, T., Ho, J., Chen, X., Sidor, S., & Sutskever, I. (2017). Evolution strategies as a scalable alternative to reinforcement learning. https://doi.org/10.1162/106365602320169811

  125. [125] Hansen, N. (2016). The CMA evolution strategy: A tutorial. https://doi.org/10.48550/arXiv.1604.00772

  126. [126] Jaderberg, M., Dalibard, V., Osindero, S., Czarnecki, W. M., Donahue, J., Razavi, A., … ., & Kavukcuoglu, K. (2017). Population based training of neural networks. https://doi.org/10.48550/arXiv.1711.09846

  127. [127] Demšar, J. (2006). Statistical comparisons of classifiers over multiple data sets. Journal of machine learning research, 7, 1–30. https://www.researchgate.net/publication/220320196

  128. [128] García, S., Fernández, A., Luengo, J., & Herrera, F. (2010). Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: Experimental analysis of power. Information sciences, 180(10), 2044–2064. https://doi.org/10.1016/j.ins.2009.12.010

  129. [129] Shahriari, B., Swersky, K., Wang, Z., Adams, R. P., & de Freitas, N. (2016). Taking the human out of the loop: A review of Bayesian optimization. Proceedings of the IEEE, 104(1), 148–175. https://doi.org/10.1109/JPROC.2015.2494218

  130. [130] Falkner, S., Klein, A., & Hutter, F. (2018). BOHB: Robust and efficient hyperparameter optimization at scale. Proceedings of the 35th international conference on machine learning (pp. 1437–1446). Proceedings of Machine Learning Research (PMLR). https://proceedings.mlr.press/v80/falkner18a.html

  131. [131] Agrawal, T., & Choudhary, P. (2022). Metaheuristic optimization algorithms. Morgan Kaufmann. https://www.sciencedirect.com/book/edited-volume/9780443139253/metaheuristic-optimization-algorithms

  132. [132] Jin, Y. (2011). Surrogate-assisted evolutionary computation: Recent advances and future challenges. Swarm and evolutionary computation, 1(2), 61–70. https://doi.org/10.1016/j.swevo.2011.05.001

  133. [133] Greff, K., Srivastava, R. K., Koutník, J., Steunebrink, B. R., & Schmidhuber, J. (2017). LSTM: A search space odyssey. IEEE transactions on neural networks and learning systems, 28(10), 2222–2232. https://doi.org/10.1109/TNNLS.2016.2582924

  134. [134] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., … Polosukhin, I. (2017). Attention is all you need. Advances in neural information processing systems (Vol. 30, pp. 5998–6008). Curran Associates, Inc. https://proceedings.neurips.cc/paper/2017/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html

  135. [135] Tekkali, C., & Natarajan, K. (2023). Smart fraud detection in E-transactions using synthetic minority oversampling and binary Harris Hawks optimization. Computers, materials, & continua, 75(2), 3171. https://doi.org/10.32604/cmc.2023.036865

  136. [136] Ma, L., Liu, Y., Zhang, X., Ye, Y., Yin, G., & Johnson, B. A. (2019). Deep learning in remote sensing applications: A meta-analysis and review. ISPRS journal of photogrammetry and remote sensing, 152, 166–177. https://doi.org/10.1016/j.isprsjprs.2019.04.015

  137. [137] Litjens, G., Kooi, T., Bejnordi, B. E., Setio, A. A. A., Ciompi, F., Ghafoorian, M., … Sánchez, C. I. (2017). A survey on deep learning in medical image analysis. Medical image analysis, 42, 60–88. https://doi.org/10.1016/j.media.2017.07.005

  138. [138] Liashchynskyi, P., & Liashchynskyi, P. (2019). Grid search, random search, genetic algorithm: A big comparison for NAS. https://doi.org/10.48550/arXiv.1912.06059

  139. [139] Wolpert, D. H., & Macready, W. G. (1997). No free lunch theorems for optimization. IEEE transactions on evolutionary computation, 1(1), 67–82. https://doi.org/10.1109/4235.585893

  140. [140] Zhang, D., Mishra, S., Brynjolfsson, E., Etchemendy, J., Ganguli, D., Grosz, B., … Perrault, R. (2024). The 2024 AI index report. https://hai.stanford.edu/ai-index/2024-ai-index-report?hl=en-US

  141. [141] Wistuba, M., Rawat, A., & Pedapati, T. (2019). A survey on neural architecture search. https://doi.org/10.48550/arXiv.1905.01392

  142. [142] Cobo, M. J., López-Herrera, A. G., Herrera-Viedma, E., & Herrera, F. (2011). Science mapping software tools: Review, analysis, and cooperative study among tools. Journal of the american society for information science and technology, 62(7), 1382–1402. https://doi.org/10.1002/asi.21525

  143. [143] Osaba, E., Yang, X.-S., & Del Ser, J. (2020). Traveling salesman problem: A perspective review of recent research and new results with bio-inspired metaheuristics. In Nature-inspired computation and swarm intelligence (pp. 135–164). Academic Press. https://doi.org/10.1016/B978-0-12-819714-1.00020-8

  144. [144] He, X., Zhao, K., & Chu, X. (2021). AutoML: A survey of the state-of-the-art. Knowledge-based systems, 212, 106622. https://doi.org/10.1016/j.knosys.2020.106622

  145. [145] Cai, H., Zhu, L., & Han, S. (2018). ProxylessNAS: Direct neural architecture search on target task and hardware. https://doi.org/10.48550/arXiv.1812.00332

  146. [146] Strubell, E., Ganesh, A., & McCallum, A. (2019). Energy and policy considerations for deep learning in NLP. Proceedings of the 57th annual meeting of the association for computational linguistics (pp. 3645–3650). Association for Computational Linguistics. https://doi.org/10.18653/v1/P19-1355

  147. [147] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., … Amodei, D. (2020). Language models are few-shot learners. Advances in neural information processing systems (Vol. 33, pp. 1877–1901). Neural Information Processing Systems Foundation. https://dl.acm.org/doi/abs/10.5555/3495724.3495883

  148. [148] Ying, C., Klein, A., Christiansen, E., Real, E., Murphy, K., & Hutter, F. (2019). NAS-bench-101: Towards reproducible neural architecture search. Proceedings of the 36th international conference on machine learning (pp. 7105–7114). Proceedings of Machine Learning Research (PMLR). https://proceedings.mlr.press/v97/ying19a.html

  149. [149] Zela, A., Siems, J., & Hutter, F. (2020). NAS-Bench-1Shot1: Benchmarking and dissecting one-shot neural architecture search. https://doi.org/10.48550/arXiv.2001.10422

  150. [150] Wong, C., Houlsby, N., Lu, Y., & Gesmundo, A. (2018). Transfer learning with neural autoML. Advances in neural information processing systems (pp. 8356–8365). Neural Information Processing Systems Foundation. https://proceedings.neurips.cc/paper_files/paper/2018/hash/bdb3c278f45e6734c35733d24299d3f4-Abstract.html

  151. [151] Henderson, P., Islam, R., Bachman, P., Pineau, J., Precup, D., & Meger, D. (2018). Deep reinforcement learning that matters. Proceedings of the AAAI conference on artificial intelligence (pp. 3207–3214). Association for the Advancement of Artificial Intelligence (AAAI). https://doi.org/10.1609/aaai.v32i1.11694

  152. [152] Lindauer, M., & Hutter, F. (2020). Best practices for scientific research on neural architecture search. Journal of machine learning research, 21(243), 1–18. http://jmlr.org/papers/v21/20-056.html

  153. [153] He, J., & Yao, X. (2001). Drift analysis and average time complexity of evolutionary algorithms. Artificial intelligence, 127(1), 57–85. https://doi.org/10.1016/S0004-3702(01)00058-3

  154. [154] Dong, X., & Yang, Y. (2020). NAS-Bench-201: Extending the scope of reproducible neural architecture search. https://doi.org/10.48550/arXiv.2001.00326

  155. [155] Siems, J., Zimmer, L., Zela, A., Lukasik, J., Keuber, M., & Hutter, F. (2021). NAS-Bench-301 and the case for surrogate benchmarks for neural architecture search. International conference on learning representations (pp. 1–11). OpenReview. https://ml.informatik.uni-freiburg.de/wp-content/uploads/papers/20-NIPS_WML-NB301.pdf

  156. [156] Mellor, J., Turner, J., Storkey, A., & Crowley, E. J. (2021). Neural architecture search without training. Proceedings of the 38th international conference on machine learning (pp. 7588–7598). Proceedings of Machine Learning Research (PMLR). https://proceedings.mlr.press/v139/mellor21a.html

  157. [157] Chen, W., Gong, X., & Wang, Z. (2021). Neural architecture search on ImageNet in four GPU hours: A theoretically inspired perspective. https://doi.org/10.48550/arXiv.2102.11535

  158. [158] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M. A., Lacroix, T., … Lample, G. (2023). LLaMA: Open and efficient foundation language models. https://doi.org/10.48550/arXiv.2302.13971

  159. [159] Hu, E. J., Shen, Y., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, S., … Chen, W. (2022). LoRA: Low-rank adaptation of large language models. International conference on learning representations (ICLR). https://arxiv.org/abs/2106.09685

  160. [160] McMahan, B., Moore, E., Ramage, D., Hampson, S., & y Arcas, B. A. (2017). Communication-efficient learning of deep networks from decentralized data. Proceedings of the 20th international conference on artificial intelligence and statistics (AISTATS 2017) (pp. 1273–1282). Proceedings of Machine Learning Research (PMLR). https://proceedings.mlr.press/v54/mcmahan17a.html

  161. [161] Schwartz, R., Dodge, J., Smith, N. A., & Etzioni, O. (2020). Green AI. Communications of the ACM, 63(12), 54–63. https://doi.org/10.1145/3381831

  162. [162] Zhang, G. (2011). Quantum-inspired evolutionary algorithms: A survey and empirical study. Journal of heuristics, 17(3), 303–351. https://doi.org/10.1007/s10732-010-9136-0

  163. [163] Aleti, A., & Moser, I. (2016). A systematic literature review of adaptive parameter control methods for evolutionary algorithms. ACM computing surveys, 49(3), 1–35. https://doi.org/10.1145/2996355

  164. [164] Li, K., Fialho, Á., Kwong, S., & Zhang, Q. (2014). Adaptive operator selection with bandits for a multiobjective evolutionary algorithm based on decomposition. IEEE transactions on evolutionary computation, 18(1), 114–130. https://doi.org/10.1109/TEVC.2013.2239648

  165. [165] Gaier, A., & Ha, D. (2019). Weight agnostic neural networks. Advances in neural information processing systems (Vol. 32, pp. 5365–5379). Curran Associates, Inc. https://proceedings.neurips.cc/paper/2019/hash/e98741479a7b998f88b8f8c9f0b6b6f1-Abstract.html

  166. [166] Thornton, C., Hutter, F., Hoos, H. H., & Leyton-Brown, K. (2013). Auto-WEKA: Combined selection and hyperparameter optimization of classification algorithms. Proceedings of the 19th ACM SIGKDD international conference on knowledge discovery and data mining (pp. 847–855). New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/2487575.2487629

  167. [167] Feurer, M., Klein, A., Eggensperger, K., Springenberg, J., Blum, M., & Hutter, F. (2015). Efficient and robust automated machine learning. Advances in neural information processing systems (Vol. 28, pp. 2962–2970). Curran Associates, Inc. https://proceedings.neurips.cc/paper/2015/hash/11d0e6287202fced83f79975ec59a3a6-Abstract.html

Published

2025-06-15

How to Cite

Ebrahimzadeh, F. (2025). Metaheuristic Optimization Algorithms in Artificial Intelligence: A Comprehensive Systematic Review of Neural Architecture Search, Hyperparameter Optimization, and Intelligent Feature Engineering. Metaheuristic Algorithms With Applications, 2(3), 236-262. https://doi.org/10.48313/maa.v2i3.49
