
Lecture Materials

The lecture slides are available here (view only).

Source Code

Files (each notebook can be opened in Google Colab):

  01_01_intro_handson.ipynb
  01_02_simple_linear_ols_regression.ipynb
  01_03_cucumber_dataset.ipynb
  01_04_feature_map_HOG.ipynb
  01_05_linear_ols_regression.ipynb
  01_06_overfit_linear_model.ipynb
  01_07_learning_curve_overfitting.ipynb
  01_08_l2_regularized_linear_regression.ipynb
  01_09_validation_curve.ipynb
  01_10_hyperparameter_search.ipynb
  01_11_preliminary_probability.ipynb
  01_12_quantile_regression.ipynb
  01_13_binary_classification.ipynb
  01_14_multiclass_classification.ipynb
  01_15_logsumexp.ipynb
  01_16_neural_network.ipynb
  01_17_neural_network_training.ipynb
  01_18_kNN.ipynb
  01_19_gaussian_kernel.ipynb
  01_20_unstructured_data.ipynb
  01_21_deep_learning.ipynb
  01_22_vanishing_gradient.ipynb
  01_23_loss_landscape.ipynb
  01_24_interpolating_predictors.ipynb
  01_25_curse_of_dimensionality.ipynb
  01_26_scale_distance_and_representation.ipynb
  01_27_similarity_learning.ipynb
  01_28_softmax_with_temperature.ipynb
  01_29_distribution_shift.ipynb
  01_30_class_imbalanced_classification.ipynb
  02_02_image_generation.ipynb
  02_03_upasmpling_convolution.ipynb
  02_04_image_object_removal.ipynb
  02_05_natural_language_processing_base_tasks.ipynb
  04_01_Bayes_parameter_distributions.ipynb
  07_01_Gaussian_mixture.ipynb
  08_01_network_data.ipynb
  09_01_tokenizers.ipynb

References

Note: we do not recommend trying to read everything here from cover to cover (textbook-style references excepted). Read the parts you need, depending on what you want to know.

AI and Society - AI-Related Laws and AI-Related Markets (.bib)

  1. 100,000 H100 clusters: power, network topology, ethernet vs infiniband, reliability, failures, checkpointing. (2024). SemiAnalysis. https://semianalysis.com/2024/06/17/100000-h100-clusters-power-network/
  2. Bloomberg. (2024). Generative AI 2024 report: assessing opportunities and disruptions in an evolving trillion-dollar market. https://www.bloomberg.com/professional/products/bloomberg-terminal/research/bloomberg-intelligence/download/generative-ai-2024-report/
  3. Hu, K. (2023). ChatGPT sets record for fastest-growing user base - analyst note. Reuters. https://www.reuters.com/technology/chatgpt-sets-record-fastest-growing-user-base-analyst-note-2023-02-01/
  4. NVIDIA announces financial results for first quarter fiscal 2026. (2025). NVIDIA Newsroom. https://nvidianews.nvidia.com/news/nvidia-announces-financial-results-for-first-quarter-fiscal-2026
  5. 人工知能関連技術の研究開発及び活用の推進に関する法律, (2025). https://laws.e-gov.go.jp/law/507AC0000000053
  6. 日本放送協会. (2025). 「性的ディープフェイク」で行政罰の条例改正案を可決 鳥取県. NHKニュース. https://www3.nhk.or.jp/news/html/20250630/k10014848661000.html
  7. 特許法. Retrieved July 18, 2025, from https://laws.e-gov.go.jp/law/334AC0000000121#Mp-Ch_1
  8. 特許庁. (2019). 特許・実用新案審査ハンドブック 附属書B 第1章 コンピュータソフトウエア関連発明. https://www.jpo.go.jp/system/laws/rule/guideline/patent/handbook_shinsa/document/index/app_b1.pdf
  9. 文化審議会著作権分科会法制度小委員会. (2024). AIと著作権に関する考え方について. https://www.bunka.go.jp/seisaku/bunkashingikai/chosakuken/pdf/94037901_01.pdf
  10. 著作権法第三十条第四項, (2019). https://laws.e-gov.go.jp/law/345AC0000000048#Mp-Ch_2-Se_3-Ss_5-At_30_4
  11. 著作権法第十条第三項第三号. Retrieved July 17, 2025, from https://laws.e-gov.go.jp/law/345AC0000000048#Mp-Ch_2-Se_1
  12. 著作権法の一部を改正する法律(平成30年法律第30号)について | 文化庁. Retrieved July 17, 2025, from https://www.bunka.go.jp/seisaku/chosakuken/hokaisei/h30_hokaisei/

AI and Society - Ethics, Safety, and Governance (.bib)

  1. Angwin, J., Larson, J., Mattu, S., Kirchner, L., & ProPublica. (2016). Machine Bias. ProPublica. https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
  2. Chan, C., Ginosar, S., Zhou, T., & Efros, A. (2019). Everybody dance now. 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 5932–5941. https://doi.org/10.1109/ICCV.2019.00603
  3. Chapagain, D., Kshetri, N., & Aryal, B. (2024). Deepfake disasters: a comprehensive review of technology, ethical concerns, countermeasures, and societal implications. 2024 International Conference on Emerging Trends in Networks and Computer Communications (ETNCC), 1–9. https://doi.org/10.1109/ETNCC63262.2024.10767452
  4. Chouldechova, A. (2017). Fair Prediction with Disparate Impact: A Study of Bias in Recidivism Prediction Instruments. Big Data, 5(2), 153–163. https://doi.org/10.1089/big.2016.0047
  5. Corbett-Davies, S., Pierson, E., Feller, A., Goel, S., & Huq, A. (2017). Algorithmic decision making and the cost of fairness. Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 797–806. https://doi.org/10.1145/3097983.3098095
  6. delving_2025_july.png. (2025). GitHub. https://github.com/berenslab/llm-excess-vocab/blob/main/figures/post-publication-updates/delving_2025_july.png
  7. Fernando, T., Priyasad, D., Sridharan, S., Ross, A., & Fookes, C. (2025). Face deepfakes: a comprehensive review. https://doi.org/10.48550/arXiv.2502.09812
  8. 会計担当が38億円を詐欺グループに送金、ビデオ会議のCFOは偽物. (2024). CNN.co.jp. https://www.cnn.co.jp/world/35214839.html
  9. Kleinberg, J., Mullainathan, S., & Raghavan, M. (2017). Inherent trade-offs in the fair determination of risk scores. In C. H. Papadimitriou (Ed.), 8th Innovations in Theoretical Computer Science Conference (Vol. 67, pp. 43:1–43:23). Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Germany. https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2017.43
  10. Kobak, D., González-Márquez, R., Horvát, E.-Á., & Lause, J. (2025). Delving into LLM-assisted writing in biomedical publications through excess vocabulary. Science Advances, 11(27), eadt3813. https://doi.org/10.1126/sciadv.adt3813
  11. Li, P., Yang, J., Islam, M. A., & Ren, S. (2025). Making AI less ’thirsty.’ Commun. ACM, 68(7), 54–61. https://doi.org/10.1145/3724499
  12. Manzini, A., Keeling, G., Alberts, L., Vallor, S., Morris, M. R., & Gabriel, I. (2024). The code that binds us: navigating the appropriateness of human-AI assistant relationships. Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 7(1), 943–957. https://doi.org/10.1609/aies.v7i1.31694
  13. Marx, K. (1959). Economic & Philosophic Manuscripts of 1844 (M. Milligan, Tran.). Progress Publishers. https://www.marxists.org/archive/marx/works/1844/manuscripts/preface.htm
  14. Mata v. Avianca, Inc. (No. 1:22-cv-01461). District Court, S.D. New York. Retrieved July 13, 2025, from https://www.courtlistener.com/docket/63107798/mata-v-avianca-inc/
  15. Moffatt v. Air Canada. (2024). In CanLII (Vol. 149, Number SC-2023-005609). BCCRT. https://canlii.ca/t/k2spq
  16. OWASP. (2024). OWASP Top 10 for LLM applications & generative AI (Technical Report OWASP PDF v4.2.0a 20241114-202703).
  17. 生成AIでランサムウェアを作成した容疑者の摘発事例を考察. (2025). Trend Micro. https://www.trendmicro.com/ja_jp/jp-security/24/e/breaking-securitynews-20240529-02.html
  18. 生成AI悪用し楽天モバイルに不正アクセス、1000件以上の回線入手し転売か…容疑で中高生3人逮捕. (2025). 読売新聞オンライン. https://www.yomiuri.co.jp/national/20250226-OYT1T50205/
  19. 生成AI悪用しウイルス作成、有罪判決…IT知識なくとも「1か月ぐらいで簡単に作れた」. (2024). 読売新聞オンライン. https://www.yomiuri.co.jp/national/20241025-OYT1T50209/
  20. Storchan, V., Kumar, R., Chowdhury, R., Goldfarb-Tarrant, S., & Cattell, S. (2024). Generative AI red teaming challenge: transparency report [Technical Report]. DEF CON. https://drive.google.com/file/d/1JqpbIP6DNomkb32umLoiEPombK2-0Rc-/view
  21. Tabassi, E. (2023). Artificial Intelligence Risk Management Framework (AI RMF 1.0). National Institute of Standards and Technology. https://doi.org/10.6028/nist.ai.100-1
  22. Weidinger, L., Barnhart, J., Brennan, J., Butterfield, C., Young, S., Hawkins, W., Hendricks, L. A., Comanescu, R., Chang, O., Rodriguez, M., Beroshi, J., Bloxwich, D., Proleev, L., Chen, J., Farquhar, S., Ho, L., Gabriel, I., Dafoe, A., & Isaac, W. (2024). Holistic safety and responsibility evaluations of advanced AI models. https://doi.org/10.48550/arXiv.2404.14068
  23. Writers Guild of America. (2023). Summary of the 2023 WGA MBA. WGA Contract 2023. https://www.wgacontract2023.org/the-campaign/summary-of-the-2023-wga-mba

AI and Society - Real-World Application Examples - Classification (.bib)

    AI and Society - Real-World Application Examples - Regression (.bib)

    1. DeepETA: How Uber Predicts Arrival Times Using Deep Learning. (2022). Uber Blog. https://www.uber.com/blog/deepeta-how-uber-predicts-arrival-times/
    2. GO株式会社. (2023). タクシーアプリ『GO』の データ基盤の全体像. https://www.slideshare.net/slideshow/ss-258369181/258369181
    3. Ye, P., Qian, J., Chen, J., Wu, C.-H., Zhou, Y., De Mars, S., Yang, F., & Zhang, L. (2018). Customized Regression Model for Airbnb Dynamic Pricing. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 932–940. https://doi.org/10.1145/3219819.3219830

    AI and Society - Real-World Application Examples (.bib)

    1. DeepETA: How Uber Predicts Arrival Times Using Deep Learning. (2022). Uber Blog. https://www.uber.com/blog/deepeta-how-uber-predicts-arrival-times/
    2. GO株式会社. (2023). タクシーアプリ『GO』の データ基盤の全体像. https://www.slideshare.net/slideshow/ss-258369181/258369181
    3. Ye, P., Qian, J., Chen, J., Wu, C.-H., Zhou, Y., De Mars, S., Yang, F., & Zhang, L. (2018). Customized Regression Model for Airbnb Dynamic Pricing. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 932–940. https://doi.org/10.1145/3219819.3219830

    AI and Society - Law and AI-Related Markets (.bib)

      AI and Society - Productivity and Changing Job Tasks (.bib)

      1. Cui, Z. (K.), Demirer, M., Jaffe, S., Musolff, L., Peng, S., & Salz, T. (2025). The effects of generative AI on high-skilled work: evidence from three field experiments with software developers (Number 4945566) [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.4945566
      2. Eloundou, T., Manning, S., Mishkin, P., & Rock, D. (2023). GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models. https://doi.org/10.48550/arXiv.2303.10130
      3. Filippucci, F., Gal, P., Jona-Lasinio, C., Leandro, A., & Nicoletti, G. (2024). The impact of artificial intelligence on productivity, distribution and growth: key mechanisms, initial evidence and policy challenges (OECD Artificial Intelligence Papers, No. 15). OECD Publishing. https://doi.org/10.1787/8d900037-en
      4. Filippucci, F., Gal, P., & Schief, M. (2024). Miracle or myth? Assessing the macroeconomic productivity gains from artificial intelligence (OECD Artificial Intelligence Papers, No. 29). OECD Publishing. https://doi.org/10.1787/b524a072-en
      5. Handa, K., Tamkin, A., McCain, M., Huang, S., Durmus, E., Heck, S., Mueller, J., Hong, J., Ritchie, S., Belonax, T., Troy, K. K., Amodei, D., Kaplan, J., Clark, J., & Ganguli, D. (2025). Which economic tasks are performed with AI? Evidence from millions of Claude conversations. https://doi.org/10.48550/arXiv.2503.04761

      AI and Society - Productivity and Impact on Occupations (.bib)

      1. Cui, Z. (K.), Demirer, M., Jaffe, S., Musolff, L., Peng, S., & Salz, T. (2025). The effects of generative AI on high-skilled work: evidence from three field experiments with software developers (Number 4945566) [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.4945566

      AI and Society - Productivity, Job-Task Changes, and the Labor Market (.bib)

      1. Brynjolfsson, E., Li, D., & Raymond, L. R. (2023). Generative AI at work (Number 31161) [Working Paper]. https://doi.org/10.3386/w31161
      2. Brynjolfsson, E., Mitchell, T., & Rock, D. (2018). What can machines learn, and what does it mean for occupations and the economy? AEA Papers and Proceedings, 108, 43–47. https://doi.org/10.1257/pandp.20181019
      3. Cazzaniga, M. (2024). Gen-AI: artificial intelligence and the future of work. Staff Discussion Notes, 2024(001), 1. https://doi.org/10.5089/9798400262548.006
      4. Cui, Z. (K.), Demirer, M., Jaffe, S., Musolff, L., Peng, S., & Salz, T. (2025). The effects of generative AI on high-skilled work: evidence from three field experiments with software developers (Number 4945566) [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.4945566
      5. Eloundou, T., Manning, S., Mishkin, P., & Rock, D. (2023). GPTs are GPTs: an early look at the labor market impact potential of large language models. https://doi.org/10.48550/arXiv.2303.10130
      6. Felten, E. W., Raj, M., & Seamans, R. (2023). How will Language Modelers like ChatGPT Affect Occupations and Industries? (Number 4375268) [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.4375268
      7. Felten, E. W., Raj, M., & Seamans, R. (2018). A method to link advances in artificial intelligence to occupational abilities. AEA Papers and Proceedings, 108, 54–57. https://doi.org/10.1257/pandp.20181021
      8. Felten, E. W., Raj, M., & Seamans, R. (2023). Occupational heterogeneity in exposure to Generative AI (Number 4414065) [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.4414065
      9. Filippucci, F., Gal, P., Jona-Lasinio, C., Leandro, A., & Nicoletti, G. (2024). The impact of artificial intelligence on productivity, distribution and growth: key mechanisms, initial evidence and policy challenges (OECD Artificial Intelligence Papers, No. 15). OECD Publishing. https://doi.org/10.1787/8d900037-en
      10. Filippucci, F., Gal, P., & Schief, M. (2024). Miracle or myth? Assessing the macroeconomic productivity gains from artificial intelligence (OECD Artificial Intelligence Papers, No. 29). OECD Publishing. https://doi.org/10.1787/b524a072-en
      11. Handa, K., Tamkin, A., McCain, M., Huang, S., Durmus, E., Heck, S., Mueller, J., Hong, J., Ritchie, S., Belonax, T., Troy, K. K., Amodei, D., Kaplan, J., Clark, J., & Ganguli, D. (2025). Which economic tasks are performed with AI? Evidence from millions of Claude conversations. https://doi.org/10.48550/arXiv.2503.04761
      12. Lane, M. (2024). Who will be the workers most affected by AI? A closer look at the impact of AI on women, low-skilled workers and other groups (Technical Report No. 26). OECD Publishing. https://doi.org/10.1787/14dc6f89-en
      13. Lipsey, R. G., Carlaw, K. I., & Bekar, C. T. (2005). Economic Transformations: General Purpose Technologies and Long-Term Economic Growth. Oxford University Press. https://doi.org/10.1093/oso/9780199285648.001.0001
      14. Mäkelä, E., & Stephany, F. (2025). Complement or substitute? How AI increases the demand for human skills (Number 5153230) [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.5153230
      15. Noy, S., & Zhang, W. (2023). Experimental evidence on the productivity effects of generative artificial intelligence. Science, 381(6654), 187–192. https://doi.org/10.1126/science.adh2586
      16. Peng, S., Kalliamvakou, E., Cihon, P., & Demirer, M. (2023). The impact of AI on developer productivity: evidence from GitHub Copilot. https://doi.org/10.48550/arXiv.2302.06590
      17. Webb, M. (2019). The Impact of Artificial Intelligence on the Labor Market (Number 3482150) [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.3482150
      18. World Economic Forum. (2025). Future of Jobs Report 2025 [Insight Report]. https://www.weforum.org/publications/the-future-of-jobs-report-2025/digest/
      19. 新田 尭之. 生成AIが描く日本の職業の明暗とその対応策.

      AI and Society (.bib)

      1. 100,000 H100 clusters: power, network topology, ethernet vs infiniband, reliability, failures, checkpointing. (2024). SemiAnalysis. https://semianalysis.com/2024/06/17/100000-h100-clusters-power-network/
      2. Angwin, J., Larson, J., Mattu, S., Kirchner, L., & ProPublica. (2016). Machine Bias. ProPublica. https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
      3. Bloomberg. (2024). Generative AI 2024 report: assessing opportunities and disruptions in an evolving trillion-dollar market. https://www.bloomberg.com/professional/products/bloomberg-terminal/research/bloomberg-intelligence/download/generative-ai-2024-report/
      4. Brynjolfsson, E., Li, D., & Raymond, L. R. (2023). Generative AI at work (Number 31161) [Working Paper]. https://doi.org/10.3386/w31161
      5. Brynjolfsson, E., Mitchell, T., & Rock, D. (2018). What can machines learn, and what does it mean for occupations and the economy? AEA Papers and Proceedings, 108, 43–47. https://doi.org/10.1257/pandp.20181019
      6. Cazzaniga, M. (2024). Gen-AI: artificial intelligence and the future of work. Staff Discussion Notes, 2024(001), 1. https://doi.org/10.5089/9798400262548.006
      7. Chan, C., Ginosar, S., Zhou, T., & Efros, A. (2019). Everybody dance now. 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 5932–5941. https://doi.org/10.1109/ICCV.2019.00603
      8. Chapagain, D., Kshetri, N., & Aryal, B. (2024). Deepfake disasters: a comprehensive review of technology, ethical concerns, countermeasures, and societal implications. 2024 International Conference on Emerging Trends in Networks and Computer Communications (ETNCC), 1–9. https://doi.org/10.1109/ETNCC63262.2024.10767452
      9. Chouldechova, A. (2017). Fair Prediction with Disparate Impact: A Study of Bias in Recidivism Prediction Instruments. Big Data, 5(2), 153–163. https://doi.org/10.1089/big.2016.0047
      10. Corbett-Davies, S., Pierson, E., Feller, A., Goel, S., & Huq, A. (2017). Algorithmic decision making and the cost of fairness. Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 797–806. https://doi.org/10.1145/3097983.3098095
      11. Cui, Z. (K.), Demirer, M., Jaffe, S., Musolff, L., Peng, S., & Salz, T. (2025). The effects of generative AI on high-skilled work: evidence from three field experiments with software developers (Number 4945566) [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.4945566
      12. DeepETA: How Uber Predicts Arrival Times Using Deep Learning. (2022). Uber Blog. https://www.uber.com/blog/deepeta-how-uber-predicts-arrival-times/
      13. delving_2025_july.png. (2025). GitHub. https://github.com/berenslab/llm-excess-vocab/blob/main/figures/post-publication-updates/delving_2025_july.png
      14. Eloundou, T., Manning, S., Mishkin, P., & Rock, D. (2023). GPTs are GPTs: an early look at the labor market impact potential of large language models. https://doi.org/10.48550/arXiv.2303.10130
      15. Felten, E. W., Raj, M., & Seamans, R. (2023). How will Language Modelers like ChatGPT Affect Occupations and Industries? (Number 4375268) [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.4375268
      16. Felten, E. W., Raj, M., & Seamans, R. (2018). A method to link advances in artificial intelligence to occupational abilities. AEA Papers and Proceedings, 108, 54–57. https://doi.org/10.1257/pandp.20181021
      17. Felten, E. W., Raj, M., & Seamans, R. (2023). Occupational heterogeneity in exposure to Generative AI (Number 4414065) [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.4414065
      18. Fernando, T., Priyasad, D., Sridharan, S., Ross, A., & Fookes, C. (2025). Face deepfakes: a comprehensive review. https://doi.org/10.48550/arXiv.2502.09812
      19. Filippucci, F., Gal, P., Jona-Lasinio, C., Leandro, A., & Nicoletti, G. (2024). The impact of artificial intelligence on productivity, distribution and growth: key mechanisms, initial evidence and policy challenges (OECD Artificial Intelligence Papers, No. 15). OECD Publishing. https://doi.org/10.1787/8d900037-en
      20. Filippucci, F., Gal, P., & Schief, M. (2024). Miracle or myth? Assessing the macroeconomic productivity gains from artificial intelligence (OECD Artificial Intelligence Papers, No. 29). OECD Publishing. https://doi.org/10.1787/b524a072-en
      21. GO株式会社. (2023). タクシーアプリ『GO』の データ基盤の全体像. https://www.slideshare.net/slideshow/ss-258369181/258369181
      22. Handa, K., Tamkin, A., McCain, M., Huang, S., Durmus, E., Heck, S., Mueller, J., Hong, J., Ritchie, S., Belonax, T., Troy, K. K., Amodei, D., Kaplan, J., Clark, J., & Ganguli, D. (2025). Which economic tasks are performed with AI? Evidence from millions of Claude conversations. https://doi.org/10.48550/arXiv.2503.04761
      23. Hu, K. (2023). ChatGPT sets record for fastest-growing user base - analyst note. Reuters. https://www.reuters.com/technology/chatgpt-sets-record-fastest-growing-user-base-analyst-note-2023-02-01/
      24. 会計担当が38億円を詐欺グループに送金、ビデオ会議のCFOは偽物. (2024). CNN.co.jp. https://www.cnn.co.jp/world/35214839.html
      25. Kleinberg, J., Mullainathan, S., & Raghavan, M. (2017). Inherent trade-offs in the fair determination of risk scores. In C. H. Papadimitriou (Ed.), 8th Innovations in Theoretical Computer Science Conference (Vol. 67, pp. 43:1–43:23). Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Germany. https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2017.43
      26. Kobak, D., González-Márquez, R., Horvát, E.-Á., & Lause, J. (2025). Delving into LLM-assisted writing in biomedical publications through excess vocabulary. Science Advances, 11(27), eadt3813. https://doi.org/10.1126/sciadv.adt3813
      27. Lane, M. (2024). Who will be the workers most affected by AI? A closer look at the impact of AI on women, low-skilled workers and other groups (Technical Report No. 26). OECD Publishing. https://doi.org/10.1787/14dc6f89-en
      28. Li, P., Yang, J., Islam, M. A., & Ren, S. (2025). Making AI less ’thirsty.’ Commun. ACM, 68(7), 54–61. https://doi.org/10.1145/3724499
      29. Lipsey, R. G., Carlaw, K. I., & Bekar, C. T. (2005). Economic Transformations: General Purpose Technologies and Long-Term Economic Growth. Oxford University Press. https://doi.org/10.1093/oso/9780199285648.001.0001
      30. Mäkelä, E., & Stephany, F. (2025). Complement or substitute? How AI increases the demand for human skills (Number 5153230) [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.5153230
      31. Manzini, A., Keeling, G., Alberts, L., Vallor, S., Morris, M. R., & Gabriel, I. (2024). The code that binds us: navigating the appropriateness of human-AI assistant relationships. Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 7(1), 943–957. https://doi.org/10.1609/aies.v7i1.31694
      32. Marx, K. (1959). Economic & Philosophic Manuscripts of 1844 (M. Milligan, Tran.). Progress Publishers. https://www.marxists.org/archive/marx/works/1844/manuscripts/preface.htm
      33. Mata v. Avianca, Inc. (No. 1:22-cv-01461). District Court, S.D. New York. Retrieved July 13, 2025, from https://www.courtlistener.com/docket/63107798/mata-v-avianca-inc/
      34. Moffatt v. Air Canada. (2024). In CanLII (Vol. 149, Number SC-2023-005609). BCCRT. https://canlii.ca/t/k2spq
      35. Noy, S., & Zhang, W. (2023). Experimental evidence on the productivity effects of generative artificial intelligence. Science, 381(6654), 187–192. https://doi.org/10.1126/science.adh2586
      36. NVIDIA announces financial results for first quarter fiscal 2026. (2025). NVIDIA Newsroom. https://nvidianews.nvidia.com/news/nvidia-announces-financial-results-for-first-quarter-fiscal-2026
      37. OWASP. (2024). OWASP Top 10 for LLM applications & generative AI (Technical Report OWASP PDF v4.2.0a 20241114-202703).
      38. Peng, S., Kalliamvakou, E., Cihon, P., & Demirer, M. (2023). The impact of AI on developer productivity: evidence from GitHub Copilot. https://doi.org/10.48550/arXiv.2302.06590
      39. 人工知能関連技術の研究開発及び活用の推進に関する法律, (2025). https://laws.e-gov.go.jp/law/507AC0000000053
      40. 日本放送協会. (2025). 「性的ディープフェイク」で行政罰の条例改正案を可決 鳥取県. NHKニュース. https://www3.nhk.or.jp/news/html/20250630/k10014848661000.html
      41. 生成AIでランサムウェアを作成した容疑者の摘発事例を考察. (2025). Trend Micro. https://www.trendmicro.com/ja_jp/jp-security/24/e/breaking-securitynews-20240529-02.html
      42. 生成AI悪用し楽天モバイルに不正アクセス、1000件以上の回線入手し転売か…容疑で中高生3人逮捕. (2025). 読売新聞オンライン. https://www.yomiuri.co.jp/national/20250226-OYT1T50205/
      43. 生成AI悪用しウイルス作成、有罪判決…IT知識なくとも「1か月ぐらいで簡単に作れた」. (2024). 読売新聞オンライン. https://www.yomiuri.co.jp/national/20241025-OYT1T50209/
      44. Storchan, V., Kumar, R., Chowdhury, R., Goldfarb-Tarrant, S., & Cattell, S. (2024). Generative AI red teaming challenge: transparency report [Technical Report]. DEF CON. https://drive.google.com/file/d/1JqpbIP6DNomkb32umLoiEPombK2-0Rc-/view
      45. Tabassi, E. (2023). Artificial Intelligence Risk Management Framework (AI RMF 1.0). National Institute of Standards and Technology. https://doi.org/10.6028/nist.ai.100-1
      46. Tamkin, A., McCain, M., Handa, K., Durmus, E., Lovitt, L., Rathi, A., Huang, S., Mountfield, A., Hong, J., Ritchie, S., Stern, M., Clarke, B., Goldberg, L., Sumers, T. R., Mueller, J., McEachen, W., Mitchell, W., Carter, S., Clark, J., … Ganguli, D. (2024). Clio: privacy-preserving insights into real-world ai use. https://doi.org/10.48550/arXiv.2412.13678
      47. 特許法. Retrieved July 18, 2025, from https://laws.e-gov.go.jp/law/334AC0000000121#Mp-Ch_1
      48. 特許庁. (2019). 特許・実用新案審査ハンドブック 附属書B 第1章 コンピュータソフトウエア関連発明. https://www.jpo.go.jp/system/laws/rule/guideline/patent/handbook_shinsa/document/index/app_b1.pdf
      49. Webb, M. (2019). The Impact of Artificial Intelligence on the Labor Market (Number 3482150) [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.3482150
      50. Weidinger, L., Barnhart, J., Brennan, J., Butterfield, C., Young, S., Hawkins, W., Hendricks, L. A., Comanescu, R., Chang, O., Rodriguez, M., Beroshi, J., Bloxwich, D., Proleev, L., Chen, J., Farquhar, S., Ho, L., Gabriel, I., Dafoe, A., & Isaac, W. (2024). Holistic safety and responsibility evaluations of advanced AI models. https://doi.org/10.48550/arXiv.2404.14068
      51. 文化審議会著作権分科会法制度小委員会. (2024). AIと著作権に関する考え方について. https://www.bunka.go.jp/seisaku/bunkashingikai/chosakuken/pdf/94037901_01.pdf
      52. World Economic Forum. (2025). Future of Jobs Report 2025 [Insight Report]. https://www.weforum.org/publications/the-future-of-jobs-report-2025/digest/
      53. Writers Guild of America. (2023). Summary of the 2023 WGA MBA. WGA Contract 2023. https://www.wgacontract2023.org/the-campaign/summary-of-the-2023-wga-mba
      54. 新田 尭之. 生成AIが描く日本の職業の明暗とその対応策.
      55. Ye, P., Qian, J., Chen, J., Wu, C.-H., Zhou, Y., De Mars, S., Yang, F., & Zhang, L. (2018). Customized Regression Model for Airbnb Dynamic Pricing. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 932–940. https://doi.org/10.1145/3219819.3219830
      56. 著作権法第三十条第四項, (2019). https://laws.e-gov.go.jp/law/345AC0000000048#Mp-Ch_2-Se_3-Ss_5-At_30_4
      57. 著作権法第十条第三項第三号. Retrieved July 17, 2025, from https://laws.e-gov.go.jp/law/345AC0000000048#Mp-Ch_2-Se_1
      58. 著作権法の一部を改正する法律(平成30年法律第30号)について | 文化庁. Retrieved July 17, 2025, from https://www.bunka.go.jp/seisaku/chosakuken/hokaisei/h30_hokaisei/

      Introduction (.bib)

      1. 人工知能学会. (2023, May edition). AIマップβ2.0. https://www.ai-gakkai.or.jp/aimap/
      2. 神嶌 敏弘. (2019). 変わりゆく機械学習と変わらない機械学習. 日本物理学会誌, 74(1), 5–13. https://doi.org/10.11316/butsuri.74.1_5

      Data Science - Interim Analysis (.bib)

      1. O’Brien, P. C., & Fleming, T. R. (1979). A multiple testing procedure for clinical trials. Biometrics, 35(3), 549–556. https://doi.org/10.2307/2530245
      2. Proschan, M. A., Lan, K. K. G., & Wittes, J. T. (2006). Statistical Monitoring of Clinical Trials: A Unified Approach. Springer.

      Data Science (.bib)

      1. 北川 源四郎, 竹村 彰通, 赤穂 昭太郎, 今泉 允聡, 内田 誠一, 清 智也, 高野 渉, 辻 真吾, 原 尚幸, 久野 遼平, 松原 仁, 宮地 充子, 森畑 明昌, & 宿久 洋. (2023). 応用基礎としてのデータサイエンス AI×データ活用の実践. 講談社.
      2. O’Brien, P. C., & Fleming, T. R. (1979). A multiple testing procedure for clinical trials. Biometrics, 35(3), 549–556. https://doi.org/10.2307/2530245
      3. Proschan, M. A., Lan, K. K. G., & Wittes, J. T. (2006). Statistical Monitoring of Clinical Trials: A Unified Approach. Springer.

      Datasets (.bib)

      1. Dalal, N., & Triggs, B. (2005). Histograms of oriented gradients for human detection. 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), 1, 886–893 vol. 1. https://doi.org/10.1109/CVPR.2005.177
      2. De Cock, D. (2011). Ames, Iowa: alternative to the Boston housing data as an end of semester regression project. Journal of Statistics Education, 19(3), 8. https://doi.org/10.1080/10691898.2011.11889627
      3. Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., & Fei-Fei, L. (2009). ImageNet: A large-scale hierarchical image database. 2009 IEEE Conference on Computer Vision and Pattern Recognition, 248–255. https://doi.org/10.1109/CVPR.2009.5206848
      4. Forina, M., Armanino, C., Castino, M., & Ubigli, M. (1986). Multivariate data analysis as a discriminating method of the origin of wines. VITIS - Journal of Grapevine Research, 25(3), 189–201. https://doi.org/10.5073/vitis.1986.25.189-201
      5. ndl-lab/pdmocrdataset-part1. (2024). ndl-lab. https://github.com/ndl-lab/pdmocrdataset-part1
      6. Penedo, G., Malartic, Q., Hesslow, D., Cojocaru, R., Cappelli, A., Alobeidli, H., Pannier, B., Almazrouei, E., & Launay, J. (2023). The RefinedWeb dataset for Falcon LLM: outperforming curated corpora with web data, and web data only. https://doi.org/10.48550/arXiv.2306.01116
      7. Taori, R., Gulrajani, I., Zhang, T., Dubois, Y., Li, X., Guestrin, C., Liang, P., & Hashimoto, T. B. (2023). Stanford Alpaca: An Instruction-following LLaMA model [Data set]. stanford_alpaca.
      8. workpiles. (2016). CUCUMBER-9. https://github.com/workpiles/CUCUMBER-9

      Data Mining - Graph Mining (.bib)

      1. 石黒 勝彦, & 林 浩平. (2016). 関係データ学習. 講談社.
      2. 増田 直紀, & 今野 紀雄. (2010). 複雑ネットワーク―基礎から応用まで. 近代科学社.

      Data Mining - Audio Data Mining (.bib)

        Data Mining (.bib)

        1. Rokach, L., & Maimon, O. (2014). Data Mining with Decision Trees: Theory and Applications (2nd ed., Vol. 81). WORLD SCIENTIFIC. https://doi.org/10.1142/9097
        2. 石黒 勝彦, & 林 浩平. (2016). 関係データ学習. 講談社.
        3. 増田 直紀, & 今野 紀雄. (2010). 複雑ネットワーク―基礎から応用まで. 近代科学社.

        Mathematics - Information Theory (.bib)

        1. Cover, T. M., & Thomas, J. A. (2006). Elements of Information Theory (2nd ed). Wiley-Interscience.

        Mathematics - Mathematical Optimization (.bib)

        1. Defazio, A., Yang, X. A., Khaled, A., Mishchenko, K., Mehta, H., & Cutkosky, A. (2024, November 6). The road less scheduled. https://openreview.net/forum?id=0XeNkkENuI
        2. Douze, M., Guzhva, A., Deng, C., Johnson, J., Szilvasy, G., Mazaré, P.-E., Lomeli, M., Hosseini, L., & Jégou, H. (2025). The Faiss library. https://doi.org/10.48550/arXiv.2401.08281
        3. Izmailov, P., Podoprikhin, D., Garipov, T., Vetrov, D., & Wilson, A. G. (2018). Averaging weights leads to wider optima and better generalization. Proceedings of the Conference on Uncertainty in Artificial Intelligence.
        4. 金森 敬文, 鈴木 大慈, 竹内 一郎, & 佐藤 一誠. (2016). 機械学習のための連続最適化. 講談社.
        5. Kingma, D. P., & Ba, J. (2015). Adam: A method for stochastic optimization. Proceedings of 3rd International Conference for Learning Representations. http://arxiv.org/abs/1412.6980
        6. 鈴木 大慈. (2015). 確率的最適化. 講談社.
        7. 梅谷 俊治. (2020). しっかり学ぶ数理最適化 モデルからアルゴリズムまで. 講談社.
        8. 山下 信雄. (2015). 非線形計画法. 朝倉書店. http://opac.dl.itc.u-tokyo.ac.jp/opac/opac_details/?lang=0&amode=11&bibid=2003278170
        9. Takami Sato. (2024). 最適化超入門. https://speakerdeck.com/tkm2261/zui-shi-hua-chao-ru-men
        10. 穴井 宏和, & 斉藤 努. (2015). 今日から使える!組合せ最適化 離散問題ガイドブック. 講談社.
        11. 穴井 宏和. (2013). 数理最適化の実践ガイド. 講談社.

        Mathematics - Probability and Statistics (.bib)

        1. Çinlar, E. (2011). Probability and Stochastics. Springer New York. https://doi.org/10.1007/978-0-387-87859-1
        2. Hall, P., Marron, J. S., & Neeman, A. (2005). Geometric representation of high dimension, low sample size data. Journal of the Royal Statistical Society Series B: Statistical Methodology, 67(3), 427–444. https://doi.org/10.1111/j.1467-9868.2005.00510.x
        3. Tsybakov, A. B. (2009). Introduction to Nonparametric Estimation. Springer.
        4. Vershynin, R. (2018). High-Dimensional Probability: An Introduction with Applications in Data Science. Cambridge University Press.
        5. Wainwright, M. J. (2019). High-Dimensional Statistics: A Non-Asymptotic Viewpoint (1st ed.). Cambridge University Press. https://www.cambridge.org/core/product/identifier/9781108627771/type/book
        6. 竹村 彰通. (2020). 新装改訂版 現代数理統計学. 学術図書出版社.

        Mathematics - Linear Algebra and Functional Analysis (.bib)

        1. 赤穂 昭太郎. (2008). カーネル多変量解析―非線形データ解析の新しい展開. 岩波書店.
        2. 村上 正康, 稲葉 尚志, & 野沢 宗平. (1989). 演習 線形代数 (改訂版). 培風館.
        3. 福水 健次. (2010). カーネル法入門―正定値カーネルによるデータ解析. 朝倉書店.
        4. 黒田 成俊. (1980). 関数解析 (Number 15). 共立出版.
        5. Luenberger, D. G. (1997). Optimization by Vector Space Methods (1st ed.). John Wiley & Sons, Inc.
        6. Petersen, K. B., & Pedersen, M. S. (2012, November). The Matrix Cookbook. Technical University of Denmark. http://www2.compute.dtu.dk/pubdb/pubs/3274-full.html

        Mathematics - High-Dimensional Phenomena (.bib)

        1. Aggarwal, C. C., Hinneburg, A., & Keim, D. A. (2001). On the surprising behavior of distance metrics in high dimensional spaces. Proceedings of the 8th International Conference on Database Theory, 420–434.

        Mathematics (.bib)

        1. Aggarwal, C. C., Hinneburg, A., & Keim, D. A. (2001). On the surprising behavior of distance metrics in high dimensional spaces. Proceedings of the 8th International Conference on Database Theory, 420–434.
        2. 赤穂 昭太郎. (2008). カーネル多変量解析―非線形データ解析の新しい展開. 岩波書店.
        3. Çinlar, E. (2011). Probability and Stochastics. Springer New York. https://doi.org/10.1007/978-0-387-87859-1
        4. Cover, T. M., & Thomas, J. A. (2006). Elements of Information Theory (2nd ed). Wiley-Interscience.
        5. 村上 正康, 稲葉 尚志, & 野沢 宗平. (1989). 演習 線形代数 (改訂版). 培風館.
        6. Defazio, A., Yang, X. A., Khaled, A., Mishchenko, K., Mehta, H., & Cutkosky, A. (2024, November 6). The road less scheduled. https://openreview.net/forum?id=0XeNkkENuI
        7. Douze, M., Guzhva, A., Deng, C., Johnson, J., Szilvasy, G., Mazaré, P.-E., Lomeli, M., Hosseini, L., & Jégou, H. (2025). The Faiss library. https://doi.org/10.48550/arXiv.2401.08281
        8. 福水 健次. (2010). カーネル法入門―正定値カーネルによるデータ解析. 朝倉書店.
        9. Hall, P., Marron, J. S., & Neeman, A. (2005). Geometric representation of high dimension, low sample size data. Journal of the Royal Statistical Society Series B: Statistical Methodology, 67(3), 427–444. https://doi.org/10.1111/j.1467-9868.2005.00510.x
        10. 黒田 成俊. (1980). 関数解析 (Number 15). 共立出版.
        11. Izmailov, P., Podoprikhin, D., Garipov, T., Vetrov, D., & Wilson, A. G. (2018). Averaging weights leads to wider optima and better generalization. Proceedings of the Conference on Uncertainty in Artificial Intelligence.
        12. 金森 敬文, 鈴木 大慈, 竹内 一郎, & 佐藤 一誠. (2016). 機械学習のための連続最適化. 講談社.
        13. Kingma, D. P., & Ba, J. (2015). Adam: A method for stochastic optimization. Proceedings of 3rd International Conference for Learning Representations. http://arxiv.org/abs/1412.6980
        14. 鈴木 大慈. (2015). 確率的最適化. 講談社.
        15. Luenberger, D. G. (1997). Optimization by Vector Space Methods (1st ed.). John Wiley & Sons, Inc.
        16. 梅谷 俊治. (2020). しっかり学ぶ数理最適化 モデルからアルゴリズムまで. 講談社.
        17. Petersen, K. B., & Pedersen, M. S. (2012, November). The Matrix Cookbook. Technical University of Denmark. http://www2.compute.dtu.dk/pubdb/pubs/3274-full.html
        18. 山下 信雄. (2015). 非線形計画法. 朝倉書店. http://opac.dl.itc.u-tokyo.ac.jp/opac/opac_details/?lang=0&amode=11&bibid=2003278170
        19. Takami Sato. (2024). 最適化超入門. https://speakerdeck.com/tkm2261/zui-shi-hua-chao-ru-men
        20. Tsybakov, A. B. (2009). Introduction to Nonparametric Estimation. Springer.
        21. Vershynin, R. (2018). High-Dimensional Probability: An Introduction with Applications in Data Science. Cambridge University Press.
        22. Wainwright, M. J. (2019). High-Dimensional Statistics: A Non-Asymptotic Viewpoint (1st ed.). Cambridge University Press. https://www.cambridge.org/core/product/identifier/9781108627771/type/book
        23. 穴井 宏和, & 斉藤 努. (2015). 今日から使える!組合せ最適化 離散問題ガイドブック. 講談社.
        24. 穴井 宏和. (2013). 数理最適化の実践ガイド. 講談社.
        25. 竹村 彰通. (2020). 新装改訂版 現代数理統計学. 学術図書出版社.

        Machine Learning - Bayesian methods - Gaussian processes (.bib)

        1. Rasmussen, C. E., & Williams, C. K. I. (2005). Gaussian Processes for Machine Learning. The MIT Press.

        Machine Learning - Bayesian methods - Stochastic Gradient Langevin Dynamics (.bib)

        1. Welling, M., & Teh, Y. W. (2011). Bayesian learning via stochastic gradient Langevin dynamics. Proceedings of the 28th International Conference on International Conference on Machine Learning, 681–688.

        Machine Learning - Bayesian methods - Variational inference (.bib)

        1. Gershman, S. J., & Goodman, N. D. (2014). Amortized inference in probabilistic reasoning. Proceedings of the Annual Meeting of the Cognitive Science Society, 36.
        2. Kingma, D. P., & Welling, M. (2019). An Introduction to Variational Autoencoders. Foundations and Trends® in Machine Learning, 12(4), 307–392. https://doi.org/10.1561/2200000056

        Machine Learning - Bayesian methods - General (.bib)

        1. Edwards, W. (1982). Conservatism in human information processing. In A. Tversky, D. Kahneman, & P. Slovic (Eds.), Judgment under Uncertainty: Heuristics and Biases (pp. 359–369). Cambridge University Press. https://doi.org/10.1017/CBO9780511809477.026
        2. 繁桝 算男. (1985). ベイズ統計入門. 東京大学出版会.

        Machine Learning - Bayesian methods (.bib)

        1. Edwards, W. (1982). Conservatism in human information processing. In A. Tversky, D. Kahneman, & P. Slovic (Eds.), Judgment under Uncertainty: Heuristics and Biases (pp. 359–369). Cambridge University Press. https://doi.org/10.1017/CBO9780511809477.026
        2. 繁桝 算男. (1985). ベイズ統計入門. 東京大学出版会.
        3. Gershman, S. J., & Goodman, N. D. (2014). Amortized inference in probabilistic reasoning. Proceedings of the Annual Meeting of the Cognitive Science Society, 36.
        4. Kingma, D. P., & Welling, M. (2019). An Introduction to Variational Autoencoders. Foundations and Trends® in Machine Learning, 12(4), 307–392. https://doi.org/10.1561/2200000056
        5. Rasmussen, C. E., & Williams, C. K. I. (2005). Gaussian Processes for Machine Learning. The MIT Press.
        6. Welling, M., & Teh, Y. W. (2011). Bayesian learning via stochastic gradient Langevin dynamics. Proceedings of the 28th International Conference on International Conference on Machine Learning, 681–688.

        Machine Learning - Conformal prediction (.bib)

        1. Angelopoulos, A. N., & Bates, S. (2022). A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification. http://arxiv.org/abs/2107.07511
        2. Vovk, V., Gammerman, A., & Shafer, G. (2022). Algorithmic Learning in a Random World. Springer International Publishing. https://doi.org/10.1007/978-3-031-06649-8

        Machine Learning - Hyperparameter tuning (.bib)

        1. Bergstra, J., & Bengio, Y. (2012). Random search for hyper-parameter optimization. Journal of Machine Learning Research, 13, 281–305.

        Machine Learning - LLM - 2024 (.bib)

          Machine Learning - LLM - 2025 (.bib)

          1. Gemini Team, Google. (2025). Gemini 2.5: pushing the frontier with advanced reasoning, multimodality, long context, and next generation agentic capabilities. https://storage.googleapis.com/deepmind-media/gemini/gemini_v2_5_report.pdf

          Machine Learning - LLM - FineTuning (.bib)

          1. Dettmers, T., Pagnoni, A., Holtzman, A., & Zettlemoyer, L. (2023, November 2). QLoRA: Efficient Finetuning of Quantized LLMs. https://openreview.net/forum?id=OUIFPHEgJU

          Machine Learning - LLM - LLM-as-a-judge (.bib)

          1. Thakur, A. S., Choudhary, K., Ramayapally, V. S., Vaidyanathan, S., & Hupkes, D. (2024). Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges. https://doi.org/10.48550/arXiv.2406.12624
          2. Zeng, Z., Yu, J., Gao, T., Meng, Y., Goyal, T., & Chen, D. (2023, October 13). Evaluating Large Language Models at Evaluating Instruction Following. https://openreview.net/forum?id=tr0KidwPLc
          3. Zheng, L., Chiang, W.-L., Sheng, Y., Zhuang, S., Wu, Z., Zhuang, Y., Lin, Z., Li, Z., Li, D., Xing, E., Zhang, H., Gonzalez, J. E., & Stoica, I. (2023, November 2). Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena. https://openreview.net/forum?id=uccHPGDlao

          Machine Learning - LLM - MoE (.bib)

          1. Fedus, W., Zoph, B., & Shazeer, N. (2022). Switch transformers: scaling to trillion parameter models with simple and efficient sparsity. J. Mach. Learn. Res., 23(1), 120:5232–120:5270.

          Machine Learning - LLM - RLHF (.bib)

          1. Ustalov, D., & Lambert, N. (2023, July 24). Reinforcement Learning from Human Feedback: A Tutorial [Tutorial]. https://icml.cc/virtual/2023/tutorial/21554

          Machine Learning - LLM - Training (.bib)

          1. The Llama 3 herd of models. (2024). https://doi.org/10.48550/arXiv.2407.21783
          2. Rafailov, R., Sharma, A., Mitchell, E., Manning, C. D., Ermon, S., & Finn, C. (2023, November 2). Direct preference optimization: your language model is secretly a reward model. https://openreview.net/forum?id=HPuSIXJaa9
          3. Kimi Team, Du, A., Gao, B., Xing, B., Jiang, C., Chen, C., Li, C., Xiao, C., Du, C., Liao, C., Tang, C., Wang, C., Zhang, D., Yuan, E., Lu, E., Tang, F., Sung, F., Wei, G., Lai, G., … Lin, Z. (2025). Kimi k1.5: Scaling Reinforcement Learning with LLMs. arXiv.org. https://arxiv.org/abs/2501.12599v4

          Machine Learning - LLM - Alignment (.bib)

          1. 1,000億パラメータの独自LLM「PLaMo-100B」の事後学習が完了. (2024). Preferred Networks Research & Development. https://tech.preferred.jp/ja/blog/plamo-100b-post-training/
          2. Askell, A., Bai, Y., Chen, A., Drain, D., Ganguli, D., Henighan, T., Jones, A., Joseph, N., Mann, B., DasSarma, N., Elhage, N., Hatfield-Dodds, Z., Hernandez, D., Kernion, J., Ndousse, K., Olsson, C., Amodei, D., Brown, T., Clark, J., … Kaplan, J. (2021). A general language assistant as a laboratory for alignment. https://doi.org/10.48550/arXiv.2112.00861
          3. Carlini, N., Ippolito, D., Jagielski, M., Lee, K., Tramer, F., & Zhang, C. (2022, September 29). Quantifying memorization across neural language models. https://openreview.net/forum?id=TatRHT_1cK
          4. Dong, H., Xiong, W., Pang, B., Wang, H., Zhao, H., Zhou, Y., Jiang, N., Sahoo, D., Xiong, C., & Zhang, T. (2024). RLHF Workflow: From Reward Modeling to Online RLHF. Transactions on Machine Learning Research. https://openreview.net/forum?id=a13aYUU9eU
          5. Emsley, R. (2023). ChatGPT: these are not hallucinations – they’re fabrications and falsifications. Schizophrenia, 9(1), 1–2. https://doi.org/10.1038/s41537-023-00379-4
          6. Guan, M. Y., Joglekar, M., Wallace, E., Jain, S., Barak, B., Helyar, A., Dias, R., Vallone, A., Ren, H., Wei, J., Chung, H. W., Toyer, S., Heidecke, J., Beutel, A., & Glaese, A. (2025). Deliberative alignment: reasoning enables safer language models. https://doi.org/10.48550/arXiv.2412.16339
          7. John Schulman. (2023). Reinforcement Learning from Human Feedback: Progress and Challenges. https://www.youtube.com/watch?v=hhiLw5Q_UFg
          8. Kirk, H. R., Jun, Y., Iqbal, H., Benussi, E., Volpin, F., Dreyer, F. A., Shtedritski, A., & Asano, Y. M. (2024). Bias out-of-the-box: an empirical analysis of intersectional occupational biases in popular generative language models. Proceedings of the 35th International Conference on Neural Information Processing Systems, 2611–2624.
          9. Li, T., Khashabi, D., Khot, T., Sabharwal, A., & Srikumar, V. (2020). UNQOVERing stereotyping biases via underspecified questions. In T. Cohn, Y. He, & Y. Liu (Eds.), Findings of the Association for Computational Linguistics: EMNLP 2020 (pp. 3475–3489). Association for Computational Linguistics. https://doi.org/10.18653/v1/2020.findings-emnlp.311
          10. OpenAI. (2023). GPT-4 technical report. https://doi.org/10.48550/arXiv.2303.08774
          11. Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C., Mishkin, P., Zhang, C., Agarwal, S., Slama, K., Ray, A., & others. (2022). Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems, 35, 27730–27744.
          12. Rafailov, R., Sharma, A., Mitchell, E., Manning, C. D., Ermon, S., & Finn, C. (2023, November 2). Direct preference optimization: your language model is secretly a reward model. https://openreview.net/forum?id=HPuSIXJaa9
          13. Santurkar, S., Durmus, E., Ladhak, F., Lee, C., Liang, P., & Hashimoto, T. (2023). Whose opinions do language models reflect? Proceedings of the 40th International Conference on Machine Learning, 202, 29971–30004.
          14. Schulman, J., Wolski, F., Dhariwal, P., Radford, A., & Klimov, O. (2017). Proximal Policy Optimization Algorithms. https://doi.org/10.48550/arXiv.1707.06347
          15. Stiennon, N., Ouyang, L., Wu, J., Ziegler, D. M., Lowe, R., Voss, C., Radford, A., Amodei, D., & Christiano, P. (2020). Learning to summarize from human feedback. Proceedings of the 34th International Conference on Neural Information Processing Systems, 3008–3021.
          16. Tal, Y., Magar, I., & Schwartz, R. (2022). Fewer errors, but more stereotypes? The effect of model size on gender bias. In C. Hardmeier, C. Basta, M. R. Costa-jussà, G. Stanovsky, & H. Gonen (Eds.), Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP) (pp. 112–120). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.gebnlp-1.13
          17. 篠田 一聡. (2024年8月25日). 論文紹介:Direct Preference Optimization: Your Language Model Is Secretly a Reward Model. https://speakerdeck.com/kazutoshishinoda/lun-wen-shao-jie-direct-preference-optimization-your-language-model-is-secretly-a-reward-model
          18. Zhou, C., Liu, P., Xu, P., Iyer, S., Sun, J., Mao, Y., Ma, X., Efrat, A., Yu, P., Yu, L., Zhang, S., Ghosh, G., Lewis, M., Zettlemoyer, L., & Levy, O. (2023, November 2). LIMA: less is more for alignment. https://openreview.net/forum?id=KBMOKmX2he

          Machine Learning - LLM - Agents (.bib)

          1. 7群7編 分散協調とエージェント – 電子情報通信学会知識ベース. Retrieved July 5, 2025, from https://www.ieice-hbkb.org/portal/7%e7%be%a4%e3%80%80%e3%82%b3%e3%83%b3%e3%83%94%e3%83%a5%e3%83%bc%e3%82%bf-%ef%bd%bf%ef%be%8c%ef%be%84%ef%bd%b3%ef%bd%aa%ef%bd%b1/7%e7%be%a47%e7%b7%a8-%e5%88%86%e6%95%a3%e5%8d%94%e8%aa%bf%e3%81%a8%e3%82%a8%e3%83%bc%e3%82%b8%e3%82%a7%e3%83%b3%e3%83%88/
          2. 電子情報通信学会. (2019). 7群-7編-1章 エージェントの定義・モデル・概念. In 電子情報通信学会知識ベース (ver.1 ed., Number 7群7編). https://www.ieice-hbkb.org/files/ad_base/view_pdf.html?p=/files/07/07gun_07hen_01.pdf
          3. A generalist agent. (2022). Transactions on Machine Learning Research. https://openreview.net/forum?id=1ikK0kHjvj
          4. Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K. R., & Cao, Y. (2022, September 29). ReAct: synergizing reasoning and acting in language models. https://openreview.net/forum?id=WE_vluYUL-X

          Machine Learning - LLM - Scaling Laws (.bib)

          1. Henighan, T., Kaplan, J., Katz, M., Chen, M., Hesse, C., Jackson, J., Jun, H., Brown, T. B., Dhariwal, P., Gray, S., Hallacy, C., Mann, B., Radford, A., Ramesh, A., Ryder, N., Ziegler, D. M., Schulman, J., Amodei, D., & McCandlish, S. (2020). Scaling laws for autoregressive generative modeling. https://doi.org/10.48550/arXiv.2010.14701
          2. Hernandez, D., Kaplan, J., Henighan, T., & McCandlish, S. (2021). Scaling laws for transfer. https://doi.org/10.48550/arXiv.2102.01293
          3. Training compute-optimal large language models. (2022). http://arxiv.org/abs/2203.15556
          4. Kaplan, J., McCandlish, S., Henighan, T., Brown, T. B., Chess, B., Child, R., Gray, S., Radford, A., Wu, J., & Amodei, D. (2020). Scaling laws for neural language models. https://doi.org/10.48550/arXiv.2001.08361
          5. Li, C. (2020). OpenAI’s GPT-3 language model: a technical overview. https://lambdalabs.com/blog/demystifying-gpt-3
          6. Muennighoff, N., Rush, A. M., Barak, B., Scao, T. L., Tazi, N., Piktus, A., Pyysalo, S., Wolf, T., & Raffel, C. (2023, November 2). Scaling data-constrained language models. https://openreview.net/forum?id=j5BuTrEj35
          7. Schaeffer, R., Miranda, B., & Koyejo, S. (2023, November 2). Are emergent abilities of large language models a mirage? https://openreview.net/forum?id=ITw9edRDlD
          8. Wei, J., Tay, Y., Bommasani, R., Raffel, C., Zoph, B., Borgeaud, S., Yogatama, D., Bosma, M., Zhou, D., Metzler, D., Chi, E. H., Hashimoto, T., Vinyals, O., Liang, P., Dean, J., & Fedus, W. (2022). Emergent abilities of large language models. Transactions on Machine Learning Research. https://openreview.net/forum?id=yzkSU5zdwD
          9. Yang, G., & Hu, E. J. (2022). Feature Learning in Infinite-Width Neural Networks. https://doi.org/10.48550/arXiv.2011.14522
          10. Yang, G., Simon, J. B., & Bernstein, J. (2024). A Spectral Condition for Feature Learning. https://doi.org/10.48550/arXiv.2310.17813
          11. Yang, G., Hu, E. J., Babuschkin, I., Sidor, S., Farhi, D., Pachocki, J., Liu, X., Chen, W., & Gao, J. (2022, March 7). Tensor programs V: tuning large neural networks via zero-shot hyperparameter transfer. https://www.microsoft.com/en-us/research/publication/tuning-large-neural-networks-via-zero-shot-hyperparameter-transfer/

          Machine Learning - LLM - Prompting (.bib)

          1. Kojima, T., Gu, S. S., Reid, M., Matsuo, Y., & Iwasawa, Y. (2022). Large language models are zero-shot reasoners. Proceedings of the 36th International Conference on Neural Information Processing Systems, 22199–22213. https://doi.org/10.5555/3600270.3601883
          2. Razeghi, Y., Logan IV, R. L., Gardner, M., & Singh, S. (2022). Impact of pretraining term frequencies on few-shot numerical reasoning. In Y. Goldberg, Z. Kozareva, & Y. Zhang (Eds.), Findings of the Association for Computational Linguistics: EMNLP 2022 (pp. 840–854). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.findings-emnlp.59
          3. Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B., Xia, F., Chi, E. H., Le, Q. V., & Zhou, D. (2024). Chain-of-thought prompting elicits reasoning in large language models. Proceedings of the 36th International Conference on Neural Information Processing Systems, 24824–24837.

          Machine Learning - LLM - Post-Training - RL (.bib)

          1. Bai, Y., Kadavath, S., Kundu, S., Askell, A., Kernion, J., Jones, A., Chen, A., Goldie, A., Mirhoseini, A., McKinnon, C., Chen, C., Olsson, C., Olah, C., Hernandez, D., Drain, D., Ganguli, D., Li, D., Tran-Johnson, E., Perez, E., … Kaplan, J. (2022). Constitutional AI: harmlessness from AI feedback. https://doi.org/10.48550/arXiv.2212.08073
          2. Khan, F. (2025). FareedKhan-dev/train-deepseek-r1. https://github.com/FareedKhan-dev/train-deepseek-r1
          3. Ustalov, D., & Lambert, N. (2023, July 24). Reinforcement Learning from Human Feedback: A Tutorial [Tutorial]. https://icml.cc/virtual/2023/tutorial/21554
          4. Wang, Y., Yang, Q., Zeng, Z., Ren, L., Liu, L., Peng, B., Cheng, H., He, X., Wang, K., Gao, J., Chen, W., Wang, S., Du, S. S., & Shen, Y. (2025). Reinforcement learning for reasoning in large language models with one training example. https://doi.org/10.48550/arXiv.2504.20571
          5. Wen, X., Liu, Z., Zheng, S., Xu, Z., Ye, S., Wu, Z., Liang, X., Wang, Y., Li, J., Miao, Z., Bian, J., & Yang, M. (2025). Reinforcement learning with verifiable rewards implicitly incentivizes correct reasoning in base LLMs. https://doi.org/10.48550/arXiv.2506.14245
          6. Zhao, X., Kang, Z., Feng, A., Levine, S., & Song, D. (2025). Learning to reason without external rewards. https://doi.org/10.48550/arXiv.2505.19590

          Machine Learning - LLM - Post-Training - SFT (.bib)

            Machine Learning - LLM - Post-Training - General (.bib)

            1. Lambert, N., Morrison, J., Pyatkin, V., Huang, S., Ivison, H., Brahman, F., Miranda, L. J. V., Liu, A., Dziri, N., Lyu, S., Gu, Y., Malik, S., Graf, V., Hwang, J. D., Yang, J., Bras, R. L., Tafjord, O., Wilhelm, C., Soldaini, L., … Hajishirzi, H. (2025). Tulu 3: Pushing Frontiers in Open Language Model Post-Training. https://doi.org/10.48550/arXiv.2411.15124

            Machine Learning - LLM - Post-Training - Preference Tuning (.bib)

            1. The Llama 3 herd of models. (2024). https://doi.org/10.48550/arXiv.2407.21783
            2. Rafailov, R., Sharma, A., Mitchell, E., Manning, C. D., Ermon, S., & Finn, C. (2023, November 2). Direct preference optimization: your language model is secretly a reward model. https://openreview.net/forum?id=HPuSIXJaa9

            Machine Learning - LLM - General (.bib)

            1. Azevedo, F. A. C., Carvalho, L. R. B., Grinberg, L. T., Farfel, J. M., Ferretti, R. E. L., Leite, R. E. P., Filho, W. J., Lent, R., & Herculano-Houzel, S. (2009). Equal numbers of neuronal and nonneuronal cells make the human brain an isometrically scaled-up primate brain. Journal of Comparative Neurology, 513(5), 532–541. https://doi.org/10.1002/cne.21974
            2. An empirical analysis of compute-optimal large language model training. (2022, October 31). https://openreview.net/forum?id=iBBcRUlOAPR
            3. Holtzman, A., Buys, J., Du, L., Forbes, M., & Choi, Y. (2019, September 25). The curious case of neural text degeneration. https://openreview.net/forum?id=rygGQyrFvH
            4. 山田 育矢, 鈴木 正敏, 山田 康輔, & 李 凌寒. (2023). 大規模言語モデル入門. 技術評論社.
            5. 山田 育矢, 鈴木 正敏, 西川 荘介, 藤井 一喜, 山田 康輔, & 李 凌寒. (2024). 大規模言語モデル入門Ⅱ〜生成型LLMの実装と評価. 技術評論社.

            Machine Learning - LLM - Reasoning Models (.bib)

            1. DeepSeek-AI, Guo, D., Yang, D., Zhang, H., Song, J., Zhang, R., Xu, R., Zhu, Q., Ma, S., Wang, P., Bi, X., Zhang, X., Yu, X., Wu, Y., Wu, Z. F., Gou, Z., Shao, Z., Li, Z., Gao, Z., … Zhang, Z. (2025). DeepSeek-R1: incentivizing reasoning capability in LLMs via reinforcement learning. https://doi.org/10.48550/arXiv.2501.12948
            2. Guan, M. Y., Joglekar, M., Wallace, E., Jain, S., Barak, B., Helyar, A., Dias, R., Vallone, A., Ren, H., Wei, J., Chung, H. W., Toyer, S., Heidecke, J., Beutel, A., & Glaese, A. (2025). Deliberative alignment: reasoning enables safer language models. https://doi.org/10.48550/arXiv.2412.16339
            3. Jin, M., Yu, Q., Shu, D., Zhao, H., Hua, W., Meng, Y., Zhang, Y., & Du, M. (2024). The impact of reasoning step length on large language models (L.-W. Ku, A. Martins, & V. Srikumar, Eds.; pp. 1830–1842). Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.findings-acl.108
            4. Lee, K.-H., Fischer, I., Wu, Y.-H., Marwood, D., Baluja, S., Schuurmans, D., & Chen, X. (2025). Evolving deeper LLM thinking. https://doi.org/10.48550/arXiv.2501.09891
            5. Liu, R., Gao, J., Zhao, J., Zhang, K., Li, X., Qi, B., Ouyang, W., & Zhou, B. (2025). Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling. https://doi.org/10.48550/arXiv.2502.06703
            6. Muennighoff, N., Yang, Z., Shi, W., Li, X. L., Fei-Fei, L., Hajishirzi, H., Zettlemoyer, L., Liang, P., Candès, E., & Hashimoto, T. (2025). s1: Simple test-time scaling. https://doi.org/10.48550/arXiv.2501.19393
            7. OpenAI. (2024). Learning to reason with LLMs. https://openai.com/index/learning-to-reason-with-llms/
            8. Rafailov, R., Hejna, J., Park, R., & Finn, C. (2024, August 26). From r to Q*: Your Language Model is Secretly a Q-Function. https://openreview.net/forum?id=kEVcNxtqXk
            9. Sardana, N., Portes, J., Doubov, S., & Frankle, J. (2024). Beyond Chinchilla-optimal: accounting for inference in language model scaling laws. Proceedings of the 41st International Conference on Machine Learning, 235, 43445–43460.
            10. Shinn, N., Cassano, F., Gopinath, A., Narasimhan, K. R., & Yao, S. (2023, November 2). Reflexion: language agents with verbal reinforcement learning. https://openreview.net/forum?id=vAElhFcKW6
            11. Singhal, S., Zeng, J., Bukharin, A., Zhang, Y., Shen, G., Mahabaleshwarkar, A. S., Kartal, B., Suhara, Y., Bercovich, A., Levy, I., Golan, I., Dabbah, M., El-Yaniv, R., Majumdar, S., Gitman, I., Bakhturina, E., Zhang, J. J., Su, B.-Y., Huang, G., … Konuk, T. (2025, June 21). Llama-Nemotron: Efficient Reasoning Models. https://openreview.net/forum?id=ev1xpo9mbI
            12. Snell, C. V., Lee, J., Xu, K., & Kumar, A. (2024, October 4). Scaling LLM test-time compute optimally can be more effective than scaling parameters for reasoning. https://openreview.net/forum?id=4FWAwZtd2n
            13. Wang, X., & Zhou, D. (2024, November 6). Chain-of-Thought Reasoning Without Prompting. https://openreview.net/forum?id=4Zt7S0B0Jp
            14. Wu, T., Lan, J., Yuan, W., Jiao, J., Weston, J. E., & Sukhbaatar, S. (2025, June 18). Thinking LLMs: general instruction following with thought generation. https://openreview.net/forum?id=z6SrgYCdey&noteId=t3y0Ev0lm6
            15. Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K. R., & Cao, Y. (2022, September 29). ReAct: synergizing reasoning and acting in language models. https://openreview.net/forum?id=WE_vluYUL-X
            16. Yao, S., Yu, D., Zhao, J., Shafran, I., Griffiths, T. L., Cao, Y., & Narasimhan, K. R. (2023, November 2). Tree of Thoughts: deliberate problem solving with large language models. https://openreview.net/forum?id=5Xc1ecxO1h
            17. Zelikman, E., Harik, G. R., Shao, Y., Jayasiri, V., Haber, N., & Goodman, N. (2024, August 26). Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking. https://openreview.net/forum?id=oRXPiSOGH9
            18. Zelikman, E., Wu, Y., Mu, J., & Goodman, N. (2022, October 31). STaR: bootstrapping reasoning with reasoning. https://openreview.net/forum?id=_3ELRdg2sgI

            機械学習-LLM-推論スケーリング (.bib)

            1. Yue, Z., Zhuang, H., Bai, A., Hui, K., Jagerman, R., Zeng, H., Qin, Z., Wang, D., Wang, X., & Bendersky, M. (2024, October 4). Inference Scaling for Long-Context Retrieval Augmented Generation. https://openreview.net/forum?id=FSjIrOm1vz

            機械学習-LLM-文脈内学習 (.bib)

            1. Min, S., Lewis, M., Hajishirzi, H., & Zettlemoyer, L. (2022). Noisy channel language model prompting for few-shot text classification. In S. Muresan, P. Nakov, & A. Villavicencio (Eds.), Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 5316–5330). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.acl-long.365
            2. Min, S., Lyu, X., Holtzman, A., Artetxe, M., Lewis, M., Hajishirzi, H., & Zettlemoyer, L. (2022). Rethinking the role of demonstrations: what makes in-context learning work? In Y. Goldberg, Z. Kozareva, & Y. Zhang (Eds.), Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (pp. 11048–11064). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.emnlp-main.759
            3. Transformers learn in-context by gradient descent. (2022). https://arxiv.org/abs/2212.07677v2
            4. Wei, J., Wei, J., Tay, Y., Tran, D., Webson, A., Lu, Y., Chen, X., Liu, H., Huang, D., Zhou, D., & Ma, T. (2023). Larger language models do in-context learning differently. https://openreview.net/forum?id=DRGnEkbiQZ

            機械学習-LLM-評価 (.bib)

            1. Wang, Y., Li, H., Han, X., Nakov, P., & Baldwin, T. (2023). Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs. https://doi.org/10.48550/arXiv.2308.13387

            機械学習-LLM (.bib)

            1. 1,000億パラメータの独自LLM「PLaMo-100B」の事後学習が完了. (2024). Preferred Networks Research & Development. https://tech.preferred.jp/ja/blog/plamo-100b-post-training/
            2. 7群7編 分散協調とエージェント – 電子情報通信学会知識ベース. Retrieved July 5, 2025, from https://www.ieice-hbkb.org/portal/7%e7%be%a4%e3%80%80%e3%82%b3%e3%83%b3%e3%83%94%e3%83%a5%e3%83%bc%e3%82%bf-%ef%bd%bf%ef%be%8c%ef%be%84%ef%bd%b3%ef%bd%aa%ef%bd%b1/7%e7%be%a47%e7%b7%a8-%e5%88%86%e6%95%a3%e5%8d%94%e8%aa%bf%e3%81%a8%e3%82%a8%e3%83%bc%e3%82%b8%e3%82%a7%e3%83%b3%e3%83%88/
            3. Askell, A., Bai, Y., Chen, A., Drain, D., Ganguli, D., Henighan, T., Jones, A., Joseph, N., Mann, B., DasSarma, N., Elhage, N., Hatfield-Dodds, Z., Hernandez, D., Kernion, J., Ndousse, K., Olsson, C., Amodei, D., Brown, T., Clark, J., … Kaplan, J. (2021). A general language assistant as a laboratory for alignment. https://doi.org/10.48550/arXiv.2112.00861
            4. Azevedo, F. A. C., Carvalho, L. R. B., Grinberg, L. T., Farfel, J. M., Ferretti, R. E. L., Leite, R. E. P., Filho, W. J., Lent, R., & Herculano-Houzel, S. (2009). Equal numbers of neuronal and nonneuronal cells make the human brain an isometrically scaled-up primate brain. Journal of Comparative Neurology, 513(5), 532–541. https://doi.org/10.1002/cne.21974
            5. Bai, Y., Kadavath, S., Kundu, S., Askell, A., Kernion, J., Jones, A., Chen, A., Goldie, A., Mirhoseini, A., McKinnon, C., Chen, C., Olsson, C., Olah, C., Hernandez, D., Drain, D., Ganguli, D., Li, D., Tran-Johnson, E., Perez, E., … Kaplan, J. (2022). Constitutional AI: harmlessness from AI feedback. https://doi.org/10.48550/arXiv.2212.08073
            6. Bubeck, S., Chandrasekaran, V., Eldan, R., Gehrke, J., Horvitz, E., Kamar, E., Lee, P., Lee, Y. T., Li, Y., Lundberg, S., Nori, H., Palangi, H., Ribeiro, M. T., & Zhang, Y. (2023). Sparks of Artificial General Intelligence: Early experiments with GPT-4. https://doi.org/10.48550/arXiv.2303.12712
            7. Carlini, N., Ippolito, D., Jagielski, M., Lee, K., Tramer, F., & Zhang, C. (2022, September 29). Quantifying memorization across neural language models. https://openreview.net/forum?id=TatRHT_1cK
            8. DeepSeek-AI, Guo, D., Yang, D., Zhang, H., Song, J., Zhang, R., Xu, R., Zhu, Q., Ma, S., Wang, P., Bi, X., Zhang, X., Yu, X., Wu, Y., Wu, Z. F., Gou, Z., Shao, Z., Li, Z., Gao, Z., … Zhang, Z. (2025). DeepSeek-R1: incentivizing reasoning capability in LLMs via reinforcement learning. https://doi.org/10.48550/arXiv.2501.12948
            9. Dettmers, T., Pagnoni, A., Holtzman, A., & Zettlemoyer, L. (2023, November 2). QLoRA: Efficient Finetuning of Quantized LLMs. https://openreview.net/forum?id=OUIFPHEgJU
            10. 電子情報通信学会. (2019). 7群-7編-1章 エージェントの定義・モデル・概念. In 電子情報通信学会知識ベース (ver.1 ed., Number 7群7編). https://www.ieice-hbkb.org/files/ad_base/view_pdf.html?p=/files/07/07gun_07hen_01.pdf
            11. Dong, H., Xiong, W., Pang, B., Wang, H., Zhao, H., Zhou, Y., Jiang, N., Sahoo, D., Xiong, C., & Zhang, T. (2024). RLHF Workflow: From Reward Modeling to Online RLHF. Transactions on Machine Learning Research. https://openreview.net/forum?id=a13aYUU9eU
            12. The Llama 3 herd of models. (2024). https://doi.org/10.48550/arXiv.2407.21783
            13. Eloundou, T., Manning, S., Mishkin, P., & Rock, D. (2023). GPTs are GPTs: an early look at the labor market impact potential of large language models. https://doi.org/10.48550/arXiv.2303.10130
            14. Eloundou, T., Manning, S., Mishkin, P., & Rock, D. (2024). GPTs are GPTs: Labor market impact potential of LLMs. Science, 384(6702), 1306–1308. https://doi.org/10.1126/science.adj0998
            15. Emsley, R. (2023). ChatGPT: these are not hallucinations – they’re fabrications and falsifications. Schizophrenia, 9(1), 1–2. https://doi.org/10.1038/s41537-023-00379-4
            16. Fedus, W., Zoph, B., & Shazeer, N. (2022). Switch transformers: scaling to trillion parameter models with simple and efficient sparsity. J. Mach. Learn. Res., 23(1), 120:5232–120:5270.
            17. Gemini Team, Google. (2025). Gemini 2.5: pushing the frontier with advanced reasoning, multimodality, long context, and next generation agentic capabilities. https://storage.googleapis.com/deepmind-media/gemini/gemini_v2_5_report.pdf
            18. Guan, M. Y., Joglekar, M., Wallace, E., Jain, S., Barak, B., Helyar, A., Dias, R., Vallone, A., Ren, H., Wei, J., Chung, H. W., Toyer, S., Heidecke, J., Beutel, A., & Glaese, A. (2025). Deliberative alignment: reasoning enables safer language models. https://doi.org/10.48550/arXiv.2412.16339
            19. Hayou, S., Ghosh, N., & Yu, B. (2024). LoRA+: Efficient Low Rank Adaptation of Large Models. Proceedings of the 41st International Conference on Machine Learning, 17783–17806. https://proceedings.mlr.press/v235/hayou24a.html
            20. 横井 祥. (2024). 「確率的なオウム」にできること、またそれがなぜできるのかについて. https://speakerdeck.com/eumesy/language-models-as-modern-version-of-the-use-theory-of-meaning
            21. Henighan, T., Kaplan, J., Katz, M., Chen, M., Hesse, C., Jackson, J., Jun, H., Brown, T. B., Dhariwal, P., Gray, S., Hallacy, C., Mann, B., Radford, A., Ramesh, A., Ryder, N., Ziegler, D. M., Schulman, J., Amodei, D., & McCandlish, S. (2020). Scaling laws for autoregressive generative modeling. https://doi.org/10.48550/arXiv.2010.14701
            22. Hernandez, D., Kaplan, J., Henighan, T., & McCandlish, S. (2021). Scaling laws for transfer. https://doi.org/10.48550/arXiv.2102.01293
            23. An empirical analysis of compute-optimal large language model training. (2022, October 31). https://openreview.net/forum?id=iBBcRUlOAPR
            24. Training compute-optimal large language models. (2022). http://arxiv.org/abs/2203.15556
            25. Holtzman, A., Buys, J., Du, L., Forbes, M., & Choi, Y. (2019, September 25). The curious case of neural text degeneration. https://openreview.net/forum?id=rygGQyrFvH
            26. Hu, E. J., Shen, Y., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, S., Wang, L., & Chen, W. (2022). LoRA: Low-rank adaptation of large language models. International Conference on Learning Representations. https://openreview.net/forum?id=nZeVKeeFYf9
            27. Jin, M., Yu, Q., Shu, D., Zhao, H., Hua, W., Meng, Y., Zhang, Y., & Du, M. (2024). The impact of reasoning step length on large language models. In L.-W. Ku, A. Martins, & V. Srikumar (Eds.), Findings of the Association for Computational Linguistics: ACL 2024 (pp. 1830–1842). Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.findings-acl.108
            28. Schulman, J. (2023). Reinforcement Learning from Human Feedback: Progress and Challenges. https://www.youtube.com/watch?v=hhiLw5Q_UFg
            29. Kaplan, J., McCandlish, S., Henighan, T., Brown, T. B., Chess, B., Child, R., Gray, S., Radford, A., Wu, J., & Amodei, D. (2020). Scaling laws for neural language models. https://doi.org/10.48550/arXiv.2001.08361
            30. Khan, F. (2025). FareedKhan-dev/train-deepseek-r1. https://github.com/FareedKhan-dev/train-deepseek-r1
            31. Kirk, H. R., Jun, Y., Iqbal, H., Benussi, E., Volpin, F., Dreyer, F. A., Shtedritski, A., & Asano, Y. M. (2024). Bias out-of-the-box: an empirical analysis of intersectional occupational biases in popular generative language models. Proceedings of the 35th International Conference on Neural Information Processing Systems, 2611–2624.
            32. Kojima, T., Gu, S. S., Reid, M., Matsuo, Y., & Iwasawa, Y. (2022). Large language models are zero-shot reasoners. Proceedings of the 36th International Conference on Neural Information Processing Systems, 22199–22213. https://doi.org/10.5555/3600270.3601883
            33. Lambert, N., Morrison, J., Pyatkin, V., Huang, S., Ivison, H., Brahman, F., Miranda, L. J. V., Liu, A., Dziri, N., Lyu, S., Gu, Y., Malik, S., Graf, V., Hwang, J. D., Yang, J., Bras, R. L., Tafjord, O., Wilhelm, C., Soldaini, L., … Hajishirzi, H. (2025). Tulu 3: Pushing Frontiers in Open Language Model Post-Training. https://doi.org/10.48550/arXiv.2411.15124
            34. Learn Prompting: Your Guide to Communicating with AI. (2023). https://learnprompting.org/
            35. Lee, K.-H., Fischer, I., Wu, Y.-H., Marwood, D., Baluja, S., Schuurmans, D., & Chen, X. (2025). Evolving deeper LLM thinking. https://doi.org/10.48550/arXiv.2501.09891
            36. Li, C. (2020). OpenAI’s GPT-3 language model: a technical overview. https://lambdalabs.com/blog/demystifying-gpt-3
            37. Liu, R., Gao, J., Zhao, J., Zhang, K., Li, X., Qi, B., Ouyang, W., & Zhou, B. (2025). Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling. https://doi.org/10.48550/arXiv.2502.06703
            38. Li, T., Khashabi, D., Khot, T., Sabharwal, A., & Srikumar, V. (2020). UNQOVERing stereotyping biases via underspecified questions. In T. Cohn, Y. He, & Y. Liu (Eds.), Findings of the Association for Computational Linguistics: EMNLP 2020 (pp. 3475–3489). Association for Computational Linguistics. https://doi.org/10.18653/v1/2020.findings-emnlp.311
            39. Min, S., Lewis, M., Hajishirzi, H., & Zettlemoyer, L. (2022). Noisy channel language model prompting for few-shot text classification. In S. Muresan, P. Nakov, & A. Villavicencio (Eds.), Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 5316–5330). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.acl-long.365
            40. Min, S., Lyu, X., Holtzman, A., Artetxe, M., Lewis, M., Hajishirzi, H., & Zettlemoyer, L. (2022). Rethinking the role of demonstrations: what makes in-context learning work? In Y. Goldberg, Z. Kozareva, & Y. Zhang (Eds.), Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (pp. 11048–11064). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.emnlp-main.759
            41. Muennighoff, N., Yang, Z., Shi, W., Li, X. L., Fei-Fei, L., Hajishirzi, H., Zettlemoyer, L., Liang, P., Candès, E., & Hashimoto, T. (2025). s1: Simple test-time scaling. https://doi.org/10.48550/arXiv.2501.19393
            42. Muennighoff, N., Rush, A. M., Barak, B., Scao, T. L., Tazi, N., Piktus, A., Pyysalo, S., Wolf, T., & Raffel, C. (2023, November 2). Scaling data-constrained language models. https://openreview.net/forum?id=j5BuTrEj35
            43. OpenAI Platform. Retrieved August 29, 2024, from https://platform.openai.com
            44. OpenAI. (2023). GPT-4 technical report. https://doi.org/10.48550/arXiv.2303.08774
            45. OpenAI. (2024). Learning to reason with LLMs. https://openai.com/index/learning-to-reason-with-llms/
            46. Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C., Mishkin, P., Zhang, C., Agarwal, S., Slama, K., Ray, A., & others. (2022). Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems, 35, 27730–27744.
            47. Prompt Engineering Guide – Nextra. Retrieved August 29, 2024, from https://www.promptingguide.ai/
            48. Rafailov, R., Hejna, J., Park, R., & Finn, C. (2024, August 26). From r to Q*: Your Language Model is Secretly a Q-Function. https://openreview.net/forum?id=kEVcNxtqXk
            49. Rafailov, R., Sharma, A., Mitchell, E., Manning, C. D., Ermon, S., & Finn, C. (2023, November 2). Direct preference optimization: your language model is secretly a reward model. https://openreview.net/forum?id=HPuSIXJaa9
            50. Razeghi, Y., Logan IV, R. L., Gardner, M., & Singh, S. (2022). Impact of pretraining term frequencies on few-shot numerical reasoning. In Y. Goldberg, Z. Kozareva, & Y. Zhang (Eds.), Findings of the Association for Computational Linguistics: EMNLP 2022 (pp. 840–854). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.findings-emnlp.59
            51. A generalist agent. (2022). Transactions on Machine Learning Research. https://openreview.net/forum?id=1ikK0kHjvj
            52. Santurkar, S., Durmus, E., Ladhak, F., Lee, C., Liang, P., & Hashimoto, T. (2023). Whose opinions do language models reflect? Proceedings of the 40th International Conference on Machine Learning, 202, 29971–30004.
            53. Sardana, N., Portes, J., Doubov, S., & Frankle, J. (2024). Beyond Chinchilla-optimal: accounting for inference in language model scaling laws. Proceedings of the 41st International Conference on Machine Learning, 235, 43445–43460.
            54. Schaeffer, R., Miranda, B., & Koyejo, S. (2023, November 2). Are emergent abilities of large language models a mirage? https://openreview.net/forum?id=ITw9edRDlD
            55. Schulhoff, S., Ilie, M., Balepur, N., Kahadze, K., Liu, A., Si, C., Li, Y., Gupta, A., Han, H. J., Schulhoff, S., Dulepet, P. S., Vidyadhara, S., Ki, D., Agrawal, S., Pham, C., Kroiz, G., Li, F., Tao, H., Srivastava, A., … Resnik, P. (2024). The Prompt Report: A Systematic Survey of Prompting Techniques. http://arxiv.org/abs/2406.06608
            56. Schulman, J., Wolski, F., Dhariwal, P., Radford, A., & Klimov, O. (2017). Proximal Policy Optimization Algorithms. https://doi.org/10.48550/arXiv.1707.06347
            57. 山田 育矢, 鈴木 正敏, 山田 康輔, & 李 凌寒. (2023). 大規模言語モデル入門. 技術評論社.
            58. 山田 育矢, 鈴木 正敏, 西川 荘介, 藤井 一喜, 山田 康輔, & 李 凌寒. (2024). 大規模言語モデル入門Ⅱ〜生成型LLMの実装と評価. 技術評論社.
            59. Shinn, N., Cassano, F., Gopinath, A., Narasimhan, K. R., & Yao, S. (2023, November 2). Reflexion: language agents with verbal reinforcement learning. https://openreview.net/forum?id=vAElhFcKW6
            60. Singhal, S., Zeng, J., Bukharin, A., Zhang, Y., Shen, G., Mahabaleshwarkar, A. S., Kartal, B., Suhara, Y., Bercovich, A., Levy, I., Golan, I., Dabbah, M., El-Yaniv, R., Majumdar, S., Gitman, I., Bakhturina, E., Zhang, J. J., Su, B.-Y., Huang, G., … Konuk, T. (2025, June 21). Llama-Nemotron: Efficient Reasoning Models. https://openreview.net/forum?id=ev1xpo9mbI
            61. Snell, C. V., Lee, J., Xu, K., & Kumar, A. (2024, October 4). Scaling LLM test-time compute optimally can be more effective than scaling parameters for reasoning. https://openreview.net/forum?id=4FWAwZtd2n
            62. Stiennon, N., Ouyang, L., Wu, J., Ziegler, D. M., Lowe, R., Voss, C., Radford, A., Amodei, D., & Christiano, P. (2020). Learning to summarize from human feedback. Proceedings of the 34th International Conference on Neural Information Processing Systems, 3008–3021.
            63. Tal, Y., Magar, I., & Schwartz, R. (2022). Fewer errors, but more stereotypes? The effect of model size on gender bias. In C. Hardmeier, C. Basta, M. R. Costa-jussà, G. Stanovsky, & H. Gonen (Eds.), Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP) (pp. 112–120). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.gebnlp-1.13
            64. Thakur, A. S., Choudhary, K., Ramayapally, V. S., Vaidyanathan, S., & Hupkes, D. (2024). Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges. https://doi.org/10.48550/arXiv.2406.12624
            65. Ustalov, D., & Lambert, N. (2023, July 24). Reinforcement Learning from Human Feedback: A Tutorial [Tutorial]. https://icml.cc/virtual/2023/tutorial/21554
            66. Transformers learn in-context by gradient descent. (2022). https://arxiv.org/abs/2212.07677v2
            67. Wang, X., & Zhou, D. (2024, November 6). Chain-of-Thought Reasoning Without Prompting. https://openreview.net/forum?id=4Zt7S0B0Jp
            68. Wang, Y., Li, H., Han, X., Nakov, P., & Baldwin, T. (2023). Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs. https://doi.org/10.48550/arXiv.2308.13387
            69. Wang, Y., Yang, Q., Zeng, Z., Ren, L., Liu, L., Peng, B., Cheng, H., He, X., Wang, K., Gao, J., Chen, W., Wang, S., Du, S. S., & Shen, Y. (2025). Reinforcement learning for reasoning in large language models with one training example. https://doi.org/10.48550/arXiv.2504.20571
            70. Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B., Xia, F., Chi, E. H., Le, Q. V., & Zhou, D. (2024). Chain-of-thought prompting elicits reasoning in large language models. Proceedings of the 36th International Conference on Neural Information Processing Systems, 24824–24837.
            71. Wei, J., Tay, Y., Bommasani, R., Raffel, C., Zoph, B., Borgeaud, S., Yogatama, D., Bosma, M., Zhou, D., Metzler, D., Chi, E. H., Hashimoto, T., Vinyals, O., Liang, P., Dean, J., & Fedus, W. (2022). Emergent abilities of large language models. Transactions on Machine Learning Research. https://openreview.net/forum?id=yzkSU5zdwD
            72. Wei, J., Wei, J., Tay, Y., Tran, D., Webson, A., Lu, Y., Chen, X., Liu, H., Huang, D., Zhou, D., & Ma, T. (2023). Larger language models do in-context learning differently. https://openreview.net/forum?id=DRGnEkbiQZ
            73. Wen, X., Liu, Z., Zheng, S., Xu, Z., Ye, S., Wu, Z., Liang, X., Wang, Y., Li, J., Miao, Z., Bian, J., & Yang, M. (2025). Reinforcement learning with verifiable rewards implicitly incentivizes correct reasoning in base LLMs. https://doi.org/10.48550/arXiv.2506.14245
            74. Wu, T., Lan, J., Yuan, W., Jiao, J., Weston, J. E., & Sukhbaatar, S. (2025, June 18). Thinking LLMs: general instruction following with thought generation. https://openreview.net/forum?id=z6SrgYCdey
            75. 篠田 一聡. (2024, August 25). 論文紹介:Direct Preference Optimization: Your Language Model Is Secretly a Reward Model. https://speakerdeck.com/kazutoshishinoda/lun-wen-shao-jie-direct-preference-optimization-your-language-model-is-secretly-a-reward-model
            76. Yang, G., & Hu, E. J. (2022). Feature Learning in Infinite-Width Neural Networks. https://doi.org/10.48550/arXiv.2011.14522
            77. Yang, G., Simon, J. B., & Bernstein, J. (2024). A Spectral Condition for Feature Learning. https://doi.org/10.48550/arXiv.2310.17813
            78. Yang, G., Hu, E. J., Babuschkin, I., Sidor, S., Farhi, D., Pachocki, J., Liu, X., Chen, W., & Gao, J. (2022, March 7). Tensor programs V: tuning large neural networks via zero-shot hyperparameter transfer. https://www.microsoft.com/en-us/research/publication/tuning-large-neural-networks-via-zero-shot-hyperparameter-transfer/
            79. Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K. R., & Cao, Y. (2022, September 29). ReAct: synergizing reasoning and acting in language models. https://openreview.net/forum?id=WE_vluYUL-X
            80. Yao, S., Yu, D., Zhao, J., Shafran, I., Griffiths, T. L., Cao, Y., & Narasimhan, K. R. (2023, November 2). Tree of Thoughts: deliberate problem solving with large language models. https://openreview.net/forum?id=5Xc1ecxO1h
            81. Yue, Z., Zhuang, H., Bai, A., Hui, K., Jagerman, R., Zeng, H., Qin, Z., Wang, D., Wang, X., & Bendersky, M. (2024, October 4). Inference Scaling for Long-Context Retrieval Augmented Generation. https://openreview.net/forum?id=FSjIrOm1vz
            82. Zelikman, E., Harik, G. R., Shao, Y., Jayasiri, V., Haber, N., & Goodman, N. (2024, August 26). Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking. https://openreview.net/forum?id=oRXPiSOGH9
            83. Zelikman, E., Wu, Y., Mu, J., & Goodman, N. (2022, October 31). STaR: bootstrapping reasoning with reasoning. https://openreview.net/forum?id=_3ELRdg2sgI
            84. Zeng, Z., Yu, J., Gao, T., Meng, Y., Goyal, T., & Chen, D. (2023, October 13). Evaluating Large Language Models at Evaluating Instruction Following. https://openreview.net/forum?id=tr0KidwPLc
            85. Zhao, X., Kang, Z., Feng, A., Levine, S., & Song, D. (2025). Learning to reason without external rewards. https://doi.org/10.48550/arXiv.2505.19590
            86. Zheng, L., Chiang, W.-L., Sheng, Y., Zhuang, S., Wu, Z., Zhuang, Y., Lin, Z., Li, Z., Li, D., Xing, E., Zhang, H., Gonzalez, J. E., & Stoica, I. (2023, November 2). Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena. https://openreview.net/forum?id=uccHPGDlao
            87. Zhou, C., Liu, P., Xu, P., Iyer, S., Sun, J., Mao, Y., Ma, X., Efrat, A., Yu, P., Yu, L., Zhang, S., Ghosh, G., Lewis, M., Zettlemoyer, L., & Levy, O. (2023, November 2). LIMA: less is more for alignment. https://openreview.net/forum?id=KBMOKmX2he

            機械学習-LLMs-FineTuning (.bib)

            1. Dettmers, T., Pagnoni, A., Holtzman, A., & Zettlemoyer, L. (2023, November 2). QLoRA: Efficient Finetuning of Quantized LLMs. https://openreview.net/forum?id=OUIFPHEgJU

            機械学習-LLMs-LLM-as-a-judge (.bib)

            1. Thakur, A. S., Choudhary, K., Ramayapally, V. S., Vaidyanathan, S., & Hupkes, D. (2024). Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges. https://doi.org/10.48550/arXiv.2406.12624
            2. Zeng, Z., Yu, J., Gao, T., Meng, Y., Goyal, T., & Chen, D. (2023, October 13). Evaluating Large Language Models at Evaluating Instruction Following. https://openreview.net/forum?id=tr0KidwPLc
            3. Zheng, L., Chiang, W.-L., Sheng, Y., Zhuang, S., Wu, Z., Zhuang, Y., Lin, Z., Li, Z., Li, D., Xing, E., Zhang, H., Gonzalez, J. E., & Stoica, I. (2023, November 2). Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena. https://openreview.net/forum?id=uccHPGDlao

            機械学習-LLMs-RLHF (.bib)

            1. Ustalov, D., & Lambert, N. (2023, July 24). Reinforcement Learning from Human Feedback: A Tutorial [Tutorial]. https://icml.cc/virtual/2023/tutorial/21554

            機械学習-LLMs-Training (.bib)

            1. The Llama 3 herd of models. (2024). https://doi.org/10.48550/arXiv.2407.21783
            2. Rafailov, R., Sharma, A., Mitchell, E., Manning, C. D., Ermon, S., & Finn, C. (2023, November 2). Direct preference optimization: your language model is secretly a reward model. https://openreview.net/forum?id=HPuSIXJaa9

            機械学習-LLMs-アラインメント (.bib)

            1. 1,000億パラメータの独自LLM「PLaMo-100B」の事後学習が完了. (2024). Preferred Networks Research & Development. https://tech.preferred.jp/ja/blog/plamo-100b-post-training/
            2. Askell, A., Bai, Y., Chen, A., Drain, D., Ganguli, D., Henighan, T., Jones, A., Joseph, N., Mann, B., DasSarma, N., Elhage, N., Hatfield-Dodds, Z., Hernandez, D., Kernion, J., Ndousse, K., Olsson, C., Amodei, D., Brown, T., Clark, J., … Kaplan, J. (2021). A general language assistant as a laboratory for alignment. https://doi.org/10.48550/arXiv.2112.00861
            3. Carlini, N., Ippolito, D., Jagielski, M., Lee, K., Tramer, F., & Zhang, C. (2022, September 29). Quantifying memorization across neural language models. https://openreview.net/forum?id=TatRHT_1cK
            4. Dong, H., Xiong, W., Pang, B., Wang, H., Zhao, H., Zhou, Y., Jiang, N., Sahoo, D., Xiong, C., & Zhang, T. (2024). RLHF Workflow: From Reward Modeling to Online RLHF. Transactions on Machine Learning Research. https://openreview.net/forum?id=a13aYUU9eU
            5. Emsley, R. (2023). ChatGPT: these are not hallucinations – they’re fabrications and falsifications. Schizophrenia, 9(1), 1–2. https://doi.org/10.1038/s41537-023-00379-4
            6. Schulman, J. (2023). Reinforcement Learning from Human Feedback: Progress and Challenges. https://www.youtube.com/watch?v=hhiLw5Q_UFg
            7. Kirk, H. R., Jun, Y., Iqbal, H., Benussi, E., Volpin, F., Dreyer, F. A., Shtedritski, A., & Asano, Y. M. (2024). Bias out-of-the-box: an empirical analysis of intersectional occupational biases in popular generative language models. Proceedings of the 35th International Conference on Neural Information Processing Systems, 2611–2624.
            8. Li, T., Khashabi, D., Khot, T., Sabharwal, A., & Srikumar, V. (2020). UNQOVERing stereotyping biases via underspecified questions. In T. Cohn, Y. He, & Y. Liu (Eds.), Findings of the Association for Computational Linguistics: EMNLP 2020 (pp. 3475–3489). Association for Computational Linguistics. https://doi.org/10.18653/v1/2020.findings-emnlp.311
            9. OpenAI. (2023). GPT-4 technical report. https://doi.org/10.48550/arXiv.2303.08774
            10. Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C., Mishkin, P., Zhang, C., Agarwal, S., Slama, K., Ray, A., & others. (2022). Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems, 35, 27730–27744.
            11. Rafailov, R., Sharma, A., Mitchell, E., Manning, C. D., Ermon, S., & Finn, C. (2023, November 2). Direct preference optimization: your language model is secretly a reward model. https://openreview.net/forum?id=HPuSIXJaa9
            12. Santurkar, S., Durmus, E., Ladhak, F., Lee, C., Liang, P., & Hashimoto, T. (2023). Whose opinions do language models reflect? Proceedings of the 40th International Conference on Machine Learning, 202, 29971–30004.
            13. Schulman, J., Wolski, F., Dhariwal, P., Radford, A., & Klimov, O. (2017). Proximal Policy Optimization Algorithms. https://doi.org/10.48550/arXiv.1707.06347
            14. Stiennon, N., Ouyang, L., Wu, J., Ziegler, D. M., Lowe, R., Voss, C., Radford, A., Amodei, D., & Christiano, P. (2020). Learning to summarize from human feedback. Proceedings of the 34th International Conference on Neural Information Processing Systems, 3008–3021.
            15. Tal, Y., Magar, I., & Schwartz, R. (2022). Fewer errors, but more stereotypes? The effect of model size on gender bias. In C. Hardmeier, C. Basta, M. R. Costa-jussà, G. Stanovsky, & H. Gonen (Eds.), Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP) (pp. 112–120). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.gebnlp-1.13
            16. 篠田 一聡. (2024, August 25). 論文紹介:Direct Preference Optimization: Your Language Model Is Secretly a Reward Model. https://speakerdeck.com/kazutoshishinoda/lun-wen-shao-jie-direct-preference-optimization-your-language-model-is-secretly-a-reward-model
            17. Zhou, C., Liu, P., Xu, P., Iyer, S., Sun, J., Mao, Y., Ma, X., Efrat, A., Yu, P., Yu, L., Zhang, S., Ghosh, G., Lewis, M., Zettlemoyer, L., & Levy, O. (2023, November 2). LIMA: less is more for alignment. https://openreview.net/forum?id=KBMOKmX2he

            機械学習-LLMs-スケーリング則 (.bib)

            1. Training compute-optimal large language models. (2022). http://arxiv.org/abs/2203.15556
            2. Kaplan, J., McCandlish, S., Henighan, T., Brown, T. B., Chess, B., Child, R., Gray, S., Radford, A., Wu, J., & Amodei, D. (2020). Scaling laws for neural language models. https://doi.org/10.48550/arXiv.2001.08361
            3. Li, C. (2020). OpenAI’s GPT-3 language model: a technical overview. https://lambdalabs.com/blog/demystifying-gpt-3
            4. Schaeffer, R., Miranda, B., & Koyejo, S. (2023, November 2). Are emergent abilities of large language models a mirage? https://openreview.net/forum?id=ITw9edRDlD
            5. Wei, J., Tay, Y., Bommasani, R., Raffel, C., Zoph, B., Borgeaud, S., Yogatama, D., Bosma, M., Zhou, D., Metzler, D., Chi, E. H., Hashimoto, T., Vinyals, O., Liang, P., Dean, J., & Fedus, W. (2022). Emergent abilities of large language models. Transactions on Machine Learning Research. https://openreview.net/forum?id=yzkSU5zdwD
            6. Yang, G., & Hu, E. J. (2022). Feature Learning in Infinite-Width Neural Networks. https://doi.org/10.48550/arXiv.2011.14522
            7. Yang, G., Simon, J. B., & Bernstein, J. (2024). A Spectral Condition for Feature Learning. https://doi.org/10.48550/arXiv.2310.17813
            8. Yang, G., Hu, E. J., Babuschkin, I., Sidor, S., Farhi, D., Pachocki, J., Liu, X., Chen, W., & Gao, J. (2022, March 7). Tensor programs V: tuning large neural networks via zero-shot hyperparameter transfer. https://www.microsoft.com/en-us/research/publication/tuning-large-neural-networks-via-zero-shot-hyperparameter-transfer/

            機械学習-LLMs-プロンプティング (.bib)

            1. Kojima, T., Gu, S. S., Reid, M., Matsuo, Y., & Iwasawa, Y. (2022). Large language models are zero-shot reasoners. Proceedings of the 36th International Conference on Neural Information Processing Systems, 22199–22213. https://doi.org/10.5555/3600270.3601883
            2. Razeghi, Y., Logan IV, R. L., Gardner, M., & Singh, S. (2022). Impact of pretraining term frequencies on few-shot numerical reasoning. In Y. Goldberg, Z. Kozareva, & Y. Zhang (Eds.), Findings of the Association for Computational Linguistics: EMNLP 2022 (pp. 840–854). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.findings-emnlp.59
            3. Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B., Xia, F., Chi, E. H., Le, Q. V., & Zhou, D. (2024). Chain-of-thought prompting elicits reasoning in large language models. Proceedings of the 36th International Conference on Neural Information Processing Systems, 24824–24837.

            機械学習-LLMs-全般 (.bib)

            1. Azevedo, F. A. C., Carvalho, L. R. B., Grinberg, L. T., Farfel, J. M., Ferretti, R. E. L., Leite, R. E. P., Filho, W. J., Lent, R., & Herculano-Houzel, S. (2009). Equal numbers of neuronal and nonneuronal cells make the human brain an isometrically scaled-up primate brain. Journal of Comparative Neurology, 513(5), 532–541. https://doi.org/10.1002/cne.21974
            2. An empirical analysis of compute-optimal large language model training. (2022, October 31). https://openreview.net/forum?id=iBBcRUlOAPR
            3. 山田 育矢, 鈴木 正敏, 山田 康輔, & 李 凌寒. (2023). 大規模言語モデル入門. 技術評論社.
            4. 山田 育矢, 鈴木 正敏, 西川 荘介, 藤井 一喜, 山田 康輔, & 李 凌寒. (2024). 大規模言語モデル入門Ⅱ〜生成型LLMの実装と評価. 技術評論社.

            機械学習-LLMs-文脈内学習 (.bib)

            1. Min, S., Lewis, M., Hajishirzi, H., & Zettlemoyer, L. (2022). Noisy channel language model prompting for few-shot text classification. In S. Muresan, P. Nakov, & A. Villavicencio (Eds.), Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 5316–5330). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.acl-long.365
            2. Min, S., Lyu, X., Holtzman, A., Artetxe, M., Lewis, M., Hajishirzi, H., & Zettlemoyer, L. (2022). Rethinking the role of demonstrations: what makes in-context learning work? In Y. Goldberg, Z. Kozareva, & Y. Zhang (Eds.), Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (pp. 11048–11064). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.emnlp-main.759
            3. Transformers learn in-context by gradient descent. (2022). https://arxiv.org/abs/2212.07677v2
            4. Wei, J., Wei, J., Tay, Y., Tran, D., Webson, A., Lu, Y., Chen, X., Liu, H., Huang, D., Zhou, D., & Ma, T. (2023). Larger language models do in-context learning differently. https://openreview.net/forum?id=DRGnEkbiQZ

            機械学習-LLMs-評価 (.bib)

            1. Wang, Y., Li, H., Han, X., Nakov, P., & Baldwin, T. (2023). Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs. https://doi.org/10.48550/arXiv.2308.13387

            機械学習-LLMs (.bib)

            1. 1,000億パラメータの独自LLM「PLaMo-100B」の事後学習が完了. (2024). Preferred Networks Research & Development. https://tech.preferred.jp/ja/blog/plamo-100b-post-training/
            2. Askell, A., Bai, Y., Chen, A., Drain, D., Ganguli, D., Henighan, T., Jones, A., Joseph, N., Mann, B., DasSarma, N., Elhage, N., Hatfield-Dodds, Z., Hernandez, D., Kernion, J., Ndousse, K., Olsson, C., Amodei, D., Brown, T., Clark, J., … Kaplan, J. (2021). A general language assistant as a laboratory for alignment. https://doi.org/10.48550/arXiv.2112.00861
            3. Azevedo, F. A. C., Carvalho, L. R. B., Grinberg, L. T., Farfel, J. M., Ferretti, R. E. L., Leite, R. E. P., Filho, W. J., Lent, R., & Herculano-Houzel, S. (2009). Equal numbers of neuronal and nonneuronal cells make the human brain an isometrically scaled-up primate brain. Journal of Comparative Neurology, 513(5), 532–541. https://doi.org/10.1002/cne.21974
            4. Bubeck, S., Chandrasekaran, V., Eldan, R., Gehrke, J., Horvitz, E., Kamar, E., Lee, P., Lee, Y. T., Li, Y., Lundberg, S., Nori, H., Palangi, H., Ribeiro, M. T., & Zhang, Y. (2023). Sparks of Artificial General Intelligence: Early experiments with GPT-4. https://doi.org/10.48550/arXiv.2303.12712
            5. Carlini, N., Ippolito, D., Jagielski, M., Lee, K., Tramer, F., & Zhang, C. (2022, September 29). Quantifying memorization across neural language models. https://openreview.net/forum?id=TatRHT_1cK
            6. Dettmers, T., Pagnoni, A., Holtzman, A., & Zettlemoyer, L. (2023, November 2). QLoRA: Efficient Finetuning of Quantized LLMs. https://openreview.net/forum?id=OUIFPHEgJU
            7. Dong, H., Xiong, W., Pang, B., Wang, H., Zhao, H., Zhou, Y., Jiang, N., Sahoo, D., Xiong, C., & Zhang, T. (2024). RLHF Workflow: From Reward Modeling to Online RLHF. Transactions on Machine Learning Research. https://openreview.net/forum?id=a13aYUU9eU
            8. The Llama 3 herd of models. (2024). https://doi.org/10.48550/arXiv.2407.21783
            9. Eloundou, T., Manning, S., Mishkin, P., & Rock, D. (2023). GPTs are GPTs: an early look at the labor market impact potential of large language models. https://doi.org/10.48550/arXiv.2303.10130
            10. Eloundou, T., Manning, S., Mishkin, P., & Rock, D. (2024). GPTs are GPTs: Labor market impact potential of LLMs. Science, 384(6702), 1306–1308. https://doi.org/10.1126/science.adj0998
            11. Emsley, R. (2023). ChatGPT: these are not hallucinations – they’re fabrications and falsifications. Schizophrenia, 9(1), 1–2. https://doi.org/10.1038/s41537-023-00379-4
            12. Hayou, S., Ghosh, N., & Yu, B. (2024). LoRA+: Efficient Low Rank Adaptation of Large Models. Proceedings of the 41st International Conference on Machine Learning, 17783–17806. https://proceedings.mlr.press/v235/hayou24a.html
            13. 横井 祥. (2024). 「確率的なオウム」にできること、またそれがなぜできるのかについて. https://speakerdeck.com/eumesy/language-models-as-modern-version-of-the-use-theory-of-meaning
            14. An empirical analysis of compute-optimal large language model training. (2022, October 31). https://openreview.net/forum?id=iBBcRUlOAPR
            15. Training compute-optimal large language models. (2022). http://arxiv.org/abs/2203.15556
            16. Hu, E. J., Shen, Y., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, S., Wang, L., & Chen, W. (2022). LoRA: Low-rank adaptation of large language models. International Conference on Learning Representations. https://openreview.net/forum?id=nZeVKeeFYf9
            17. Schulman, J. (2023). Reinforcement Learning from Human Feedback: Progress and Challenges. https://www.youtube.com/watch?v=hhiLw5Q_UFg
            18. Kaplan, J., McCandlish, S., Henighan, T., Brown, T. B., Chess, B., Child, R., Gray, S., Radford, A., Wu, J., & Amodei, D. (2020). Scaling laws for neural language models. https://doi.org/10.48550/arXiv.2001.08361
            19. Kirk, H. R., Jun, Y., Iqbal, H., Benussi, E., Volpin, F., Dreyer, F. A., Shtedritski, A., & Asano, Y. M. (2024). Bias out-of-the-box: an empirical analysis of intersectional occupational biases in popular generative language models. Proceedings of the 35th International Conference on Neural Information Processing Systems, 2611–2624.
            20. Kojima, T., Gu, S. S., Reid, M., Matsuo, Y., & Iwasawa, Y. (2022). Large language models are zero-shot reasoners. Proceedings of the 36th International Conference on Neural Information Processing Systems, 22199–22213. https://doi.org/10.5555/3600270.3601883
            21. Learn Prompting: Your Guide to Communicating with AI. (2023). https://learnprompting.org/
            22. Li, C. (2020). OpenAI’s GPT-3 language model: a technical overview. https://lambdalabs.com/blog/demystifying-gpt-3
            23. Li, T., Khashabi, D., Khot, T., Sabharwal, A., & Srikumar, V. (2020). UNQOVERing stereotyping biases via underspecified questions. In T. Cohn, Y. He, & Y. Liu (Eds.), Findings of the Association for Computational Linguistics: EMNLP 2020 (pp. 3475–3489). Association for Computational Linguistics. https://doi.org/10.18653/v1/2020.findings-emnlp.311
            24. Min, S., Lewis, M., Hajishirzi, H., & Zettlemoyer, L. (2022). Noisy channel language model prompting for few-shot text classification. In S. Muresan, P. Nakov, & A. Villavicencio (Eds.), Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 5316–5330). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.acl-long.365
            25. Min, S., Lyu, X., Holtzman, A., Artetxe, M., Lewis, M., Hajishirzi, H., & Zettlemoyer, L. (2022). Rethinking the role of demonstrations: what makes in-context learning work? In Y. Goldberg, Z. Kozareva, & Y. Zhang (Eds.), Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (pp. 11048–11064). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.emnlp-main.759
            26. OpenAI Platform. Retrieved August 29, 2024, from https://platform.openai.com
            27. OpenAI. (2023). GPT-4 technical report. https://doi.org/10.48550/arXiv.2303.08774
            28. Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C., Mishkin, P., Zhang, C., Agarwal, S., Slama, K., Ray, A., & others. (2022). Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems, 35, 27730–27744.
            29. Prompt Engineering Guide – Nextra. Retrieved August 29, 2024, from https://www.promptingguide.ai/
            30. Rafailov, R., Sharma, A., Mitchell, E., Manning, C. D., Ermon, S., & Finn, C. (2023, November 2). Direct preference optimization: your language model is secretly a reward model. https://openreview.net/forum?id=HPuSIXJaa9
            31. Razeghi, Y., Logan IV, R. L., Gardner, M., & Singh, S. (2022). Impact of pretraining term frequencies on few-shot numerical reasoning. In Y. Goldberg, Z. Kozareva, & Y. Zhang (Eds.), Findings of the Association for Computational Linguistics: EMNLP 2022 (pp. 840–854). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.findings-emnlp.59
            32. Santurkar, S., Durmus, E., Ladhak, F., Lee, C., Liang, P., & Hashimoto, T. (2023). Whose opinions do language models reflect? Proceedings of the 40th International Conference on Machine Learning, 202, 29971–30004.
            33. Schaeffer, R., Miranda, B., & Koyejo, S. (2023, November 2). Are emergent abilities of large language models a mirage? https://openreview.net/forum?id=ITw9edRDlD
            34. Schulhoff, S., Ilie, M., Balepur, N., Kahadze, K., Liu, A., Si, C., Li, Y., Gupta, A., Han, H. J., Schulhoff, S., Dulepet, P. S., Vidyadhara, S., Ki, D., Agrawal, S., Pham, C., Kroiz, G., Li, F., Tao, H., Srivastava, A., … Resnik, P. (2024). The Prompt Report: A Systematic Survey of Prompting Techniques. http://arxiv.org/abs/2406.06608
            35. Schulman, J., Wolski, F., Dhariwal, P., Radford, A., & Klimov, O. (2017). Proximal Policy Optimization Algorithms. https://doi.org/10.48550/arXiv.1707.06347
            36. 山田 育矢, 鈴木 正敏, 山田 康輔, & 李 凌寒. (2023). 大規模言語モデル入門. 技術評論社.
            37. 山田 育矢, 鈴木 正敏, 西川 荘介, 藤井 一喜, 山田 康輔, & 李 凌寒. (2024). 大規模言語モデル入門Ⅱ〜生成型LLMの実装と評価. 技術評論社.
            38. Stiennon, N., Ouyang, L., Wu, J., Ziegler, D. M., Lowe, R., Voss, C., Radford, A., Amodei, D., & Christiano, P. (2020). Learning to summarize from human feedback. Proceedings of the 34th International Conference on Neural Information Processing Systems, 3008–3021.
            39. Tal, Y., Magar, I., & Schwartz, R. (2022). Fewer errors, but more stereotypes? The effect of model size on gender bias. In C. Hardmeier, C. Basta, M. R. Costa-jussà, G. Stanovsky, & H. Gonen (Eds.), Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP) (pp. 112–120). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.gebnlp-1.13
            40. Thakur, A. S., Choudhary, K., Ramayapally, V. S., Vaidyanathan, S., & Hupkes, D. (2024). Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges. https://doi.org/10.48550/arXiv.2406.12624
            41. Ustalov, D., & Lambert, N. (2023, July 24). Reinforcement Learning from Human Feedback: A Tutorial [Tutorial]. https://icml.cc/virtual/2023/tutorial/21554
            42. Transformers learn in-context by gradient descent. (2022). https://arxiv.org/abs/2212.07677v2
            43. Wang, Y., Li, H., Han, X., Nakov, P., & Baldwin, T. (2023). Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs. https://doi.org/10.48550/arXiv.2308.13387
            44. Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B., Xia, F., Chi, E. H., Le, Q. V., & Zhou, D. (2024). Chain-of-thought prompting elicits reasoning in large language models. Proceedings of the 36th International Conference on Neural Information Processing Systems, 24824–24837.
            45. Wei, J., Tay, Y., Bommasani, R., Raffel, C., Zoph, B., Borgeaud, S., Yogatama, D., Bosma, M., Zhou, D., Metzler, D., Chi, E. H., Hashimoto, T., Vinyals, O., Liang, P., Dean, J., & Fedus, W. (2022). Emergent abilities of large language models. Transactions on Machine Learning Research. https://openreview.net/forum?id=yzkSU5zdwD
            46. Wei, J., Wei, J., Tay, Y., Tran, D., Webson, A., Lu, Y., Chen, X., Liu, H., Huang, D., Zhou, D., & Ma, T. (2023). Larger language models do in-context learning differently. https://openreview.net/forum?id=DRGnEkbiQZ
            47. 篠田 一聡. (2024, August 25). 論文紹介:Direct Preference Optimization: Your Language Model Is Secretly a Reward Model. https://speakerdeck.com/kazutoshishinoda/lun-wen-shao-jie-direct-preference-optimization-your-language-model-is-secretly-a-reward-model
            48. Yang, G., & Hu, E. J. (2022). Feature Learning in Infinite-Width Neural Networks. https://doi.org/10.48550/arXiv.2011.14522
            49. Yang, G., Simon, J. B., & Bernstein, J. (2024). A Spectral Condition for Feature Learning. https://doi.org/10.48550/arXiv.2310.17813
            50. Yang, G., Hu, E. J., Babuschkin, I., Sidor, S., Farhi, D., Pachocki, J., Liu, X., Chen, W., & Gao, J. (2022, March 7). Tensor programs V: tuning large neural networks via zero-shot hyperparameter transfer. https://www.microsoft.com/en-us/research/publication/tuning-large-neural-networks-via-zero-shot-hyperparameter-transfer/
            51. Zeng, Z., Yu, J., Gao, T., Meng, Y., Goyal, T., & Chen, D. (2023, October 13). Evaluating Large Language Models at Evaluating Instruction Following. https://openreview.net/forum?id=tr0KidwPLc
            52. Zheng, L., Chiang, W.-L., Sheng, Y., Zhuang, S., Wu, Z., Zhuang, Y., Lin, Z., Li, Z., Li, D., Xing, E., Zhang, H., Gonzalez, J. E., & Stoica, I. (2023, November 2). Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena. https://openreview.net/forum?id=uccHPGDlao
            53. Zhou, C., Liu, P., Xu, P., Iyer, S., Sun, J., Mao, Y., Ma, X., Efrat, A., Yu, P., Yu, L., Zhang, S., Ghosh, G., Lewis, M., Zettlemoyer, L., & Levy, O. (2023, November 2). LIMA: less is more for alignment. https://openreview.net/forum?id=KBMOKmX2he

            機械学習-Metrics (.bib)

              機械学習-タスク-分析・発見系-クラスタリング (.bib)

              1. Ahmad, A., & Khan, S. S. (2019). Survey of state-of-the-art mixed data clustering algorithms. IEEE Access, 7, 31883–31902. https://doi.org/10.1109/ACCESS.2019.2903568
              2. Arthur, D., & Vassilvitskii, S. (2007). k-means++: the advantages of careful seeding. Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, 1027–1035.
              3. Fraley, C., & Raftery, A. E. (2000). Model-based clustering, discriminant analysis, and density estimation (Technical Report No. 380). Department of Statistics, University of Washington. https://csss.uw.edu/files/working-papers/2000/wp11.pdf
              4. Cao, F., Liang, J., & Bai, L. (2009). A new initialization method for categorical data clustering. Expert Systems with Applications, 36(7), 10223–10228. https://doi.org/10.1016/j.eswa.2009.01.060
              5. A comprehensive survey of clustering algorithms: State-of-the-art machine learning applications, taxonomy, challenges, and future research prospects. (2022). Engineering Applications of Artificial Intelligence, 110, 104743. https://doi.org/10.1016/j.engappai.2022.104743
              6. Ester, M., Kriegel, H.-P., Sander, J., & Xu, X. (1996). A density-based algorithm for discovering clusters in large spatial databases with noise. Proceedings of 2nd International Conference on Knowledge Discovery and Data Mining.
              7. Fraley, C., & Raftery, A. E. (1998). How Many Clusters? Which Clustering Method? Answers Via Model-Based Cluster Analysis. The Computer Journal, 41(8), 578–588. https://doi.org/10.1093/comjnl/41.8.578
              8. Huang, Z. (1998). Extensions to the k-means algorithm for clustering large data sets with categorical values. Data Mining and Knowledge Discovery, 2(3), 283–304. https://doi.org/10.1023/A:1009769707641
              9. Kamvar, S. D., Klein, D., & Manning, C. D. (2002). Interpreting and extending classical agglomerative clustering algorithms using a model-based approach. Proceedings of the 19th International Conference on Machine Learning, 283–290.
              10. Miyamoto, S. (2012). Inductive and non-inductive methods of clustering. 2012 IEEE International Conference on Granular Computing, 12–17. https://doi.org/10.1109/GrC.2012.6468710
              11. Schubert, E., Sander, J., Ester, M., Kriegel, H. P., & Xu, X. (2017). DBSCAN revisited, revisited: why and how you should (still) use DBSCAN. ACM Trans. Database Syst., 42(3), 19:1–19:21. https://doi.org/10.1145/3068335
              12. SIGKDD News: 2014 SIGKDD Test of Time Award. (2014). https://www.kdd.org/News/view/2014-sigkdd-test-of-time-award
              13. Ward Jr., J. H. (1963). Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association, 58(301), 236–244. https://doi.org/10.1080/01621459.1963.10500845
              14. Xu, D., & Tian, Y. (2015). A Comprehensive Survey of Clustering Algorithms. Annals of Data Science, 2(2), 165–193. https://doi.org/10.1007/s40745-015-0040-1

              機械学習-タスク-分析・発見系-スパース表現獲得 (.bib)

              1. Elad, M. (2010). Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing. Springer. https://doi.org/10.1007/978-1-4419-7011-4
              2. 冨岡 亮太. (2015). スパース性に基づく機械学習 = Machine learning with sparsity inducing regularizations (MLP機械学習プロフェッショナルシリーズ) (講談社サイエンティフィク, Ed.). 講談社. https://elib.maruzen.co.jp/elib/html/BookDetail/Id/3000080942/

              機械学習-タスク-分析・発見系-全般 (.bib)

              1. Manning, C. D., Prabhakar Raghavan, & Schütze, H. (2009). An Introduction to Information Retrieval. Cambridge University Press.
              2. 石井 健一郎, & 上田 修功. (2014). 教師なし学習入門 (わかりやすいパターン認識 続). オーム社.

              機械学習-タスク-分析・発見系-変数選択 (.bib)

              1. Ge, D., Jiang, X., & Ye, Y. (2011). A note on the complexity of Lp minimization. Mathematical Programming, 129(2), 285–299. https://doi.org/10.1007/s10107-011-0470-2
              2. Natarajan, B. K. (1995). Sparse Approximate Solutions to Linear Systems. SIAM Journal on Computing, 24(2), 227–234. https://doi.org/10.1137/S0097539792240406

              機械学習-タスク-分析・発見系-次元削減 (.bib)

              1. Campbell, N. D. F., & Kautz, J. (2014). Learning a manifold of fonts. ACM Transactions on Graphics, 33(4), 1–11. https://doi.org/10.1145/2601097.2601212
              2. Jolliffe, I. T. (2010). Principal Component Analysis (2nd ed.). Springer.
              3. Ma, Y., & Fu, Y. (2012). Manifold Learning Theory and Applications. CRC Press. http://www.amazon.com/Manifold-Learning-Theory-Applications-Yunqian/dp/1439871094/

              機械学習-タスク-分析・発見系-異常検知 (.bib)

              1. Bouman, R., Bukhsh, Z., & Heskes, T. (2024). Unsupervised anomaly detection algorithms on real-world data: how many do we need? Journal of Machine Learning Research, 25, 1–34.
              2. Hariri, S., Kind, M. C., & Brunner, R. J. (2021). Extended isolation forest. IEEE Transactions on Knowledge and Data Engineering, 33(4), 1479–1489. https://doi.org/10.1109/TKDE.2019.2947676
              3. Liu, F. T., Ting, K. M., & Zhou, Z.-H. (2012). Isolation-Based Anomaly Detection. ACM Trans. Knowl. Discov. Data, 6(1), 3:1–3:39. https://doi.org/10.1145/2133360.2133363
              4. Pelletier, B. (2024). On the statistical properties of the isolation forest anomaly detection method. https://hal.science/hal-04430185

              機械学習-タスク-分析・発見系-選択的推論 (.bib)

              1. Lee, J. D., Sun, D. L., Sun, Y., & Taylor, J. E. (2016). Exact Post-Selection Inference, with Application to the Lasso. The Annals of Statistics, 44(3), 907–927. https://www.jstor.org/stable/43818915

              機械学習-タスク-分析・発見系 (.bib)

              1. Ahmad, A., & Khan, S. S. (2019). Survey of state-of-the-art mixed data clustering algorithms. IEEE Access, 7, 31883–31902. https://doi.org/10.1109/ACCESS.2019.2903568
              2. Arthur, D., & Vassilvitskii, S. (2007). k-means++: the advantages of careful seeding. Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, 1027–1035.
              3. Bouman, R., Bukhsh, Z., & Heskes, T. (2024). Unsupervised anomaly detection algorithms on real-world data: how many do we need? Journal of Machine Learning Research, 25, 1–34.
              4. Fraley, C., & Raftery, A. E. (2000). Model-based clustering, discriminant analysis, and density estimation (Technical Report No. 380). Department of Statistics, University of Washington. https://csss.uw.edu/files/working-papers/2000/wp11.pdf
              5. Campbell, N. D. F., & Kautz, J. (2014). Learning a manifold of fonts. ACM Transactions on Graphics, 33(4), 1–11. https://doi.org/10.1145/2601097.2601212
              6. Cao, F., Liang, J., & Bai, L. (2009). A new initialization method for categorical data clustering. Expert Systems with Applications, 36(7), 10223–10228. https://doi.org/10.1016/j.eswa.2009.01.060
              7. A comprehensive survey of clustering algorithms: State-of-the-art machine learning applications, taxonomy, challenges, and future research prospects. (2022). Engineering Applications of Artificial Intelligence, 110, 104743. https://doi.org/10.1016/j.engappai.2022.104743
              8. Elad, M. (2010). Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing. Springer. https://doi.org/10.1007/978-1-4419-7011-4
              9. Ester, M., Kriegel, H.-P., Sander, J., & Xu, X. (1996). A density-based algorithm for discovering clusters in large spatial databases with noise. Proceedings of 2nd International Conference on Knowledge Discovery and Data Mining.
              10. Fraley, C., & Raftery, A. E. (1998). How Many Clusters? Which Clustering Method? Answers Via Model-Based Cluster Analysis. The Computer Journal, 41(8), 578–588. https://doi.org/10.1093/comjnl/41.8.578
              11. 冨岡 亮太. (2015). スパース性に基づく機械学習 = Machine learning with sparsity inducing regularizations (MLP機械学習プロフェッショナルシリーズ) (講談社サイエンティフィク, Ed.). 講談社. https://elib.maruzen.co.jp/elib/html/BookDetail/Id/3000080942/
              12. Ge, D., Jiang, X., & Ye, Y. (2011). A note on the complexity of Lp minimization. Mathematical Programming, 129(2), 285–299. https://doi.org/10.1007/s10107-011-0470-2
              13. Hariri, S., Kind, M. C., & Brunner, R. J. (2021). Extended isolation forest. IEEE Transactions on Knowledge and Data Engineering, 33(4), 1479–1489. https://doi.org/10.1109/TKDE.2019.2947676
              14. Huang, Z. (1998). Extensions to the k-means algorithm for clustering large data sets with categorical values. Data Mining and Knowledge Discovery, 2(3), 283–304. https://doi.org/10.1023/A:1009769707641
              15. Jolliffe, I. T. (2010). Principal Component Analysis (2nd ed.). Springer.
              16. Kamvar, S. D., Klein, D., & Manning, C. D. (2002). Interpreting and extending classical agglomerative clustering algorithms using a model-based approach. Proceedings of the 19th International Conference on Machine Learning, 283–290.
              17. Lee, J. D., Sun, D. L., Sun, Y., & Taylor, J. E. (2016). Exact Post-Selection Inference, with Application to the Lasso. The Annals of Statistics, 44(3), 907–927. https://www.jstor.org/stable/43818915
              18. Liu, F. T., Ting, K. M., & Zhou, Z.-H. (2012). Isolation-Based Anomaly Detection. ACM Trans. Knowl. Discov. Data, 6(1), 3:1–3:39. https://doi.org/10.1145/2133360.2133363
              19. Ma, Y., & Fu, Y. (2012). Manifold Learning Theory and Applications. CRC Press. http://www.amazon.com/Manifold-Learning-Theory-Applications-Yunqian/dp/1439871094/
              20. Manning, C. D., Prabhakar Raghavan, & Schütze, H. (2009). An Introduction to Information Retrieval. Cambridge University Press.
              21. Miyamoto, S. (2012). Inductive and non-inductive methods of clustering. 2012 IEEE International Conference on Granular Computing, 12–17. https://doi.org/10.1109/GrC.2012.6468710
              22. Natarajan, B. K. (1995). Sparse Approximate Solutions to Linear Systems. SIAM Journal on Computing, 24(2), 227–234. https://doi.org/10.1137/S0097539792240406
              23. Pelletier, B. (2024). On the statistical properties of the isolation forest anomaly detection method. https://hal.science/hal-04430185
              24. Schubert, E., Sander, J., Ester, M., Kriegel, H. P., & Xu, X. (2017). DBSCAN revisited, revisited: why and how you should (still) use DBSCAN. ACM Trans. Database Syst., 42(3), 19:1–19:21. https://doi.org/10.1145/3068335
              25. 石井 健一郎, & 上田 修功. (2014). 教師なし学習入門 (わかりやすいパターン認識 続). オーム社.
              26. SIGKDD News: 2014 SIGKDD Test of Time Award. (2014). https://www.kdd.org/News/view/2014-sigkdd-test-of-time-award
              27. Ward Jr., J. H. (1963). Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association, 58(301), 236–244. https://doi.org/10.1080/01621459.1963.10500845
              28. Xu, D., & Tian, Y. (2015). A Comprehensive Survey of Clustering Algorithms. Annals of Data Science, 2(2), 165–193. https://doi.org/10.1007/s40745-015-0040-1

              機械学習-タスク-生成系 (.bib)

              1. Chan, C., Ginosar, S., Zhou, T., & Efros, A. (2019). Everybody dance now. 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 5932–5941. https://doi.org/10.1109/ICCV.2019.00603
              2. Dathathri, S., See, A., Ghaisas, S., Huang, P.-S., McAdam, R., Welbl, J., Bachani, V., Kaskasoli, A., Stanforth, R., Matejovicova, T., Hayes, J., Vyas, N., Merey, M. A., Brown-Cohen, J., Bunel, R., Balle, B., Cemgil, T., Ahmed, Z., Stacpoole, K., … Kohli, P. (2024). Scalable watermarking for identifying large language model outputs. Nature, 634(8035), 818–823. https://doi.org/10.1038/s41586-024-08025-4
              3. Gottesman, Y. (2023). Understand diffusion models with VAEs. Yoni Gottesman. https://yonigottesman.github.io/2023/03/11/vae.html
              4. Ho, J., Jain, A., & Abbeel, P. (2020). Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems, 33, 6840–6851.
              5. Song, Y., & Ermon, S. (2019). Generative modeling by estimating gradients of the data distribution. In Proceedings of the 33rd International Conference on Neural Information Processing Systems (Number 1067, pp. 11918–11930). Curran Associates Inc.
              6. Song, Y. (2021). Generative modeling by estimating gradients of the data distribution. Yang Song. https://yang-song.net/blog/2021/score
              7. Song, Y., Durkan, C., Murray, I., & Ermon, S. (2021, November 9). Maximum likelihood training of score-based diffusion models. https://openreview.net/forum?id=AklttWFnxS9
              8. Vincent, P. (2011). A connection between score matching and denoising autoencoders. Neural Computation, 23(7), 1661–1674. https://doi.org/10.1162/NECO_a_00142

              機械学習-タスク−分析・発見系-クラスタリング (.bib)

              1. Ahmad, A., & Khan, S. S. (2019). Survey of state-of-the-art mixed data clustering algorithms. IEEE Access, 7, 31883–31902. https://doi.org/10.1109/ACCESS.2019.2903568
              2. Arthur, D., & Vassilvitskii, S. (2007). k-means++: the advantages of careful seeding. Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, 1027–1035.
              3. Fraley, C., & Raftery, A. E. (2000). Model-based clustering, discriminant analysis, and density estimation (Technical Report No. 380). Department of Statistics, University of Washington. https://csss.uw.edu/files/working-papers/2000/wp11.pdf
              4. Cao, F., Liang, J., & Bai, L. (2009). A new initialization method for categorical data clustering. Expert Systems with Applications, 36(7), 10223–10228. https://doi.org/10.1016/j.eswa.2009.01.060
              5. Fraley, C., & Raftery, A. E. (1998). How Many Clusters? Which Clustering Method? Answers Via Model-Based Cluster Analysis. The Computer Journal, 41(8), 578–588. https://doi.org/10.1093/comjnl/41.8.578
              6. Huang, Z. (1998). Extensions to the k-means algorithm for clustering large data sets with categorical values. Data Mining and Knowledge Discovery, 2(3), 283–304. https://doi.org/10.1023/A:1009769707641
              7. Kamvar, S. D., Klein, D., & Manning, C. D. (2002). Interpreting and extending classical agglomerative clustering algorithms using a model-based approach. Proceedings of the 19th International Conference on Machine Learning, 283–290.
              8. Miyamoto, S. (2012). Inductive and non-inductive methods of clustering. 2012 IEEE International Conference on Granular Computing, 12–17. https://doi.org/10.1109/GrC.2012.6468710
              9. Ward Jr., J. H. (1963). Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association, 58(301), 236–244. https://doi.org/10.1080/01621459.1963.10500845

              機械学習-タスク−分析・発見系-スパース表現獲得 (.bib)

              1. Elad, M. (2010). Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing. Springer. https://doi.org/10.1007/978-1-4419-7011-4
              2. 冨岡 亮太. (2015). スパース性に基づく機械学習 = Machine learning with sparsity inducing regularizations (MLP機械学習プロフェッショナルシリーズ) (講談社サイエンティフィク, Ed.). 講談社. https://elib.maruzen.co.jp/elib/html/BookDetail/Id/3000080942/

              機械学習-タスク−分析・発見系-全般 (.bib)

              1. Manning, C. D., Prabhakar Raghavan, & Schütze, H. (2009). An Introduction to Information Retrieval. Cambridge University Press.

              機械学習-タスク−分析・発見系-変数選択 (.bib)

              1. Ge, D., Jiang, X., & Ye, Y. (2011). A note on the complexity of Lp minimization. Mathematical Programming, 129(2), 285–299. https://doi.org/10.1007/s10107-011-0470-2
              2. Natarajan, B. K. (1995). Sparse Approximate Solutions to Linear Systems. SIAM Journal on Computing, 24(2), 227–234. https://doi.org/10.1137/S0097539792240406

              機械学習-タスク−分析・発見系-次元削減 (.bib)

              1. Campbell, N. D. F., & Kautz, J. (2014). Learning a manifold of fonts. ACM Transactions on Graphics, 33(4), 1–11. https://doi.org/10.1145/2601097.2601212
              2. Jolliffe, I. T. (2010). Principal Component Analysis (2nd ed.). Springer.
              3. Ma, Y., & Fu, Y. (2012). Manifold Learning Theory and Applications. CRC Press. http://www.amazon.com/Manifold-Learning-Theory-Applications-Yunqian/dp/1439871094

              機械学習-タスク−分析・発見系-異常検知 (.bib)

              1. Bouman, R., Bukhsh, Z., & Heskes, T. (2024). Unsupervised anomaly detection algorithms on real-world data: how many do we need? Journal of Machine Learning Research, 25, 1–34.
              2. Hariri, S., Kind, M. C., & Brunner, R. J. (2021). Extended isolation forest. IEEE Transactions on Knowledge and Data Engineering, 33(4), 1479–1489. https://doi.org/10.1109/TKDE.2019.2947676
              3. Liu, F. T., Ting, K. M., & Zhou, Z.-H. (2012). Isolation-Based Anomaly Detection. ACM Trans. Knowl. Discov. Data, 6(1), 3:1–3:39. https://doi.org/10.1145/2133360.2133363

              機械学習-タスク−分析・発見系-選択的推論 (.bib)

              1. Lee, J. D., Sun, D. L., Sun, Y., & Taylor, J. E. (2016). Exact Post-Selection Inference, with Application to the Lasso. The Annals of Statistics, 44(3), 907–927. https://www.jstor.org/stable/43818915

              機械学習-タスク−分析・発見系 (.bib)

              1. Ahmad, A., & Khan, S. S. (2019). Survey of state-of-the-art mixed data clustering algorithms. IEEE Access, 7, 31883–31902. https://doi.org/10.1109/ACCESS.2019.2903568
              2. Arthur, D., & Vassilvitskii, S. (2007). k-means++: the advantages of careful seeding. Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, 1027–1035.
              3. Bouman, R., Bukhsh, Z., & Heskes, T. (2024). Unsupervised anomaly detection algorithms on real-world data: how many do we need? Journal of Machine Learning Research, 25, 1–34.
              4. Fraley, C., & Raftery, A. E. (2000). Model-based clustering, discriminant analysis, and density estimation (Technical Report No. 380). Department of Statistics, University of Washington. https://csss.uw.edu/files/working-papers/2000/wp11.pdf
              5. Campbell, N. D. F., & Kautz, J. (2014). Learning a manifold of fonts. ACM Transactions on Graphics, 33(4), 1–11. https://doi.org/10.1145/2601097.2601212
              6. Cao, F., Liang, J., & Bai, L. (2009). A new initialization method for categorical data clustering. Expert Systems with Applications, 36(7), 10223–10228. https://doi.org/10.1016/j.eswa.2009.01.060
              7. Elad, M. (2010). Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing. Springer. https://doi.org/10.1007/978-1-4419-7011-4
              8. Fraley, C., & Raftery, A. E. (1998). How Many Clusters? Which Clustering Method? Answers Via Model-Based Cluster Analysis. The Computer Journal, 41(8), 578–588. https://doi.org/10.1093/comjnl/41.8.578
              9. 冨岡 亮太. (2015). スパース性に基づく機械学習 = Machine learning with sparsity inducing regularizations (MLP機械学習プロフェッショナルシリーズ) (講談社サイエンティフィク, Ed.). 講談社. https://elib.maruzen.co.jp/elib/html/BookDetail/Id/3000080942/
              10. Ge, D., Jiang, X., & Ye, Y. (2011). A note on the complexity of Lp minimization. Mathematical Programming, 129(2), 285–299. https://doi.org/10.1007/s10107-011-0470-2
              11. Hariri, S., Kind, M. C., & Brunner, R. J. (2021). Extended isolation forest. IEEE Transactions on Knowledge and Data Engineering, 33(4), 1479–1489. https://doi.org/10.1109/TKDE.2019.2947676
              12. Huang, Z. (1998). Extensions to the k-means algorithm for clustering large data sets with categorical values. Data Mining and Knowledge Discovery, 2(3), 283–304. https://doi.org/10.1023/A:1009769707641
              13. Jolliffe, I. T. (2010). Principal Component Analysis (2nd ed.). Springer.
              14. Kamvar, S. D., Klein, D., & Manning, C. D. (2002). Interpreting and extending classical agglomerative clustering algorithms using a model-based approach. Proceedings of the 19th International Conference on Machine Learning, 283–290.
              15. Lee, J. D., Sun, D. L., Sun, Y., & Taylor, J. E. (2016). Exact Post-Selection Inference, with Application to the Lasso. The Annals of Statistics, 44(3), 907–927. https://www.jstor.org/stable/43818915
              16. Liu, F. T., Ting, K. M., & Zhou, Z.-H. (2012). Isolation-Based Anomaly Detection. ACM Trans. Knowl. Discov. Data, 6(1), 3:1–3:39. https://doi.org/10.1145/2133360.2133363
              17. Ma, Y., & Fu, Y. (2012). Manifold Learning Theory and Applications. CRC Press. http://www.amazon.com/Manifold-Learning-Theory-Applications-Yunqian/dp/1439871094
              18. Manning, C. D., Prabhakar Raghavan, & Schütze, H. (2009). An Introduction to Information Retrieval. Cambridge University Press.
              19. Miyamoto, S. (2012). Inductive and non-inductive methods of clustering. 2012 IEEE International Conference on Granular Computing, 12–17. https://doi.org/10.1109/GrC.2012.6468710
              20. Natarajan, B. K. (1995). Sparse Approximate Solutions to Linear Systems. SIAM Journal on Computing, 24(2), 227–234. https://doi.org/10.1137/S0097539792240406
              21. Ward Jr., J. H. (1963). Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association, 58(301), 236–244. https://doi.org/10.1080/01621459.1963.10500845

              機械学習-データの性質の仮説 (.bib)

                機械学習-トピックス-Conformal prediction (.bib)

                1. Angelopoulos, A. N., & Bates, S. (2022). A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification. http://arxiv.org/abs/2107.07511
                2. Vovk, V., Gammerman, A., & Shafer, G. (2022). Algorithmic Learning in a Random World. Springer International Publishing. https://doi.org/10.1007/978-3-031-06649-8

                機械学習-トピックス (.bib)

                1. Angelopoulos, A. N., & Bates, S. (2022). A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification. http://arxiv.org/abs/2107.07511
                2. Vovk, V., Gammerman, A., & Shafer, G. (2022). Algorithmic Learning in a Random World. Springer International Publishing. https://doi.org/10.1007/978-3-031-06649-8

                機械学習-ドメイン-テキスト (.bib)

                1. Ali, M., Fromm, M., Thellmann, K., Rutmann, R., Lübbering, M., Leveling, J., Klug, K., Ebert, J., Doll, N., Buschhoff, J., Jain, C., Weber, A., Jurkschat, L., Abdelwahab, H., John, C., Ortiz Suarez, P., Ostendorff, M., Weinbach, S., Sifa, R., … Flores-Herr, N. (2024). Tokenizer choice for LLM training: negligible or crucial? In K. Duh, H. Gomez, & S. Bethard (Eds.), Findings of the Association for Computational Linguistics: NAACL 2024 (pp. 3907–3924). Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.findings-naacl.247
                2. Arnett, C., Chang, T. A., & Bergen, B. (2024). A Bit of a Problem: Measurement Disparities in Dataset Sizes across Languages. In M. Melero, S. Sakti, & C. Soria (Eds.), Proceedings of the 3rd Annual Meeting of the Special Interest Group on Under-resourced Languages @ LREC-COLING 2024 (pp. 1–9). ELRA and ICCL. https://aclanthology.org/2024.sigul-1.1/
                3. Arnett, C., & Bergen, B. (2025). Why do language models perform worse for morphologically complex languages? In O. Rambow, L. Wanner, M. Apidianaki, H. Al-Khalifa, B. D. Eugenio, & S. Schockaert (Eds.), Proceedings of the 31st International Conference on Computational Linguistics (pp. 6607–6623). Association for Computational Linguistics. https://aclanthology.org/2025.coling-main.441/
                4. Arora, S., Liang, Y., & Ma, T. (2017, February 6). A simple but tough-to-beat baseline for sentence embeddings. https://openreview.net/forum?id=SyK00v5xx
                5. Chen, S., Wong, S., Chen, L., & Tian, Y. (2023). Extending context window of large language models via positional interpolation. https://doi.org/10.48550/arXiv.2306.15595
                6. Distributional Hypothesis - ACL Wiki. Retrieved June 22, 2025, from https://www.aclweb.org/aclwiki/index.php?title=Distributional_Hypothesis
                7. Dutta, D., Ansari, F., Chakrabarty, A., & Das, S. (2025). On the existence of universal simulators of attention. https://doi.org/10.48550/arXiv.2506.18739
                8. Hutchins, J. (1995). "The whisky was invisible", or Persistent myths of MT. MT News International, 11. https://web.archive.org/web/20210103041306/http://www.hutchinsweb.me.uk/MTNI-11-1995.pdf
                9. Kallini, J., Papadimitriou, I., Futrell, R., Mahowald, K., & Potts, C. (2024). Mission: Impossible language models. In L.-W. Ku, A. Martins, & V. Srikumar (Eds.), Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 14691–14714). Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.acl-long.787
                10. Lu, Y., & Morgan, J. L. (2020). Homophone auditory processing in cross-linguistic perspective. Proceedings of the Linguistic Society of America, 5(1), 529–542. https://doi.org/10.3765/plsa.v5i1.4733
                11. Mikolov, T., Sutskever, I., Chen, K., Corrado, G., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. Proceedings of the 27th International Conference on Neural Information Processing Systems - Volume 2, 2, 3111–3119.
                12. 坪井 祐太, 海野 裕也, & 鈴木 潤. (2017). 深層学習による自然言語処理. 講談社.
                13. Robertson, S., & Zaragoza, H. (2009). The probabilistic relevance framework: BM25 and beyond. Found. Trends Inf. Retr., 3(4), 333–389. https://doi.org/10.1561/1500000019
                14. Firth, J. R. (1957). A synopsis of linguistic theory, 1930–1955. Studies in Linguistic Analysis. https://cir.nii.ac.jp/crid/1570854175539816192?lang=en
                15. 山田 育矢, 鈴木 正敏, 山田 康輔, & 李 凌寒. (2023). 大規模言語モデル入門. 技術評論社.
                16. 山田 育矢, 鈴木 正敏, 西川 荘介, 藤井 一喜, 山田 康輔, & 李 凌寒. (2024). 大規模言語モデル入門Ⅱ〜生成型LLMの実装と評価. 技術評論社.
                17. Speech and Language Processing. Retrieved June 18, 2025, from https://web.stanford.edu/~jurafsky/slp3/
                18. Su, J., Ahmed, M., Lu, Y., Pan, S., Bo, W., & Liu, Y. (2024). RoFormer: enhanced transformer with rotary position embedding. Neurocomputing, 568(C). https://doi.org/10.1016/j.neucom.2023.127063
                19. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. Proceedings of the 31st International Conference on Neural Information Processing Systems, 6000–6010.
                20. Wang, A., Singh, A., Michael, J., Hill, F., Levy, O., & Bowman, S. (2018). GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding. In T. Linzen, G. Chrupała, & A. Alishahi (Eds.), Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP (pp. 353–355). Association for Computational Linguistics. https://doi.org/10.18653/v1/W18-5446
                21. Warner, B., Chaffin, A., Clavié, B., Weller, O., Hallström, O., Taghadouini, S., Gallagher, A., Biswas, R., Ladhak, F., Aarsen, T., Cooper, N., Adams, G., Howard, J., & Poli, I. (2024). Smarter, better, faster, longer: a modern bidirectional encoder for fast, memory efficient, and long context finetuning and inference. https://doi.org/10.48550/arXiv.2412.13663
                22. Wettig, A., Gao, T., Zhong, Z., & Chen, D. (2023). Should you mask 15% in masked language modeling? In A. Vlachos & I. Augenstein (Eds.), Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (pp. 2985–3000). Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.eacl-main.217
                23. Xiong, R., Yang, Y., He, D., Zheng, K., Zheng, S., Xing, C., Zhang, H., Lan, Y., Wang, L., & Liu, T.-Y. (2020). On layer normalization in the transformer architecture. Proceedings of the 37th International Conference on Machine Learning, 119, 10524–10533.
                24. 岩田 具治. (2015). トピックモデル. 講談社.
                25. Zhang, Y., & Teng, Z. (2021). Natural Language Processing: A Machine Learning Perspective. Cambridge University Press.
                26. Zhang, B., & Sennrich, R. (2019). Root mean square layer normalization. In Proceedings of the 33rd International Conference on Neural Information Processing Systems (Number 1110, pp. 12381–12392). Curran Associates Inc.

                機械学習-ドメイン-ネットワーク (.bib)

                1. Clauset, A., Newman, M. E. J., & Moore, C. (2004). Finding community structure in very large networks. Physical Review E, 70(6), 066111. https://doi.org/10.1103/PhysRevE.70.066111
                2. Girvan, M., & Newman, M. E. J. (2002). Community structure in social and biological networks. Proceedings of the National Academy of Sciences, 99(12), 7821–7826. https://doi.org/10.1073/pnas.122653799
                3. Newman, M. E. J. (2004). Fast algorithm for detecting community structure in networks. Physical Review E, 69(6), 066133. https://doi.org/10.1103/PhysRevE.69.066133
                4. Newman, M. E. J., & Girvan, M. (2004). Finding and evaluating community structure in networks. Physical Review E, 69(2), 026113. https://doi.org/10.1103/PhysRevE.69.026113
                5. Newman, M. E. J. (2006). Modularity and community structure in networks. Proceedings of the National Academy of Sciences, 103(23), 8577–8582. https://doi.org/10.1073/pnas.0601602103
                6. Zachary, W. W. (1977). An information flow model for conflict and fission in small groups. Journal of Anthropological Research, 33(4), 452–473. http://www.jstor.org/stable/3629752
                7. 増田 直紀, & 今野 紀雄. (2010). 複雑ネットワーク―基礎から応用まで. 近代科学社.

                機械学習-ドメイン-マルチモーダル (.bib)

                1. Be My Eyes (Ed.). (2024). Be My Eyes Accessibility with GPT-4o. https://www.youtube.com/watch?v=Zq710AKC1gg
                2. Fu, L., Yang, B., Kuang, Z., Song, J., Li, Y., Zhu, L., Luo, Q., Wang, X., Lu, H., Huang, M., Li, Z., Tang, G., Shan, B., Lin, C., Liu, Q., Wu, B., Feng, H., Liu, H., Huang, C., … Bai, X. (2024). OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning (Version 1). https://doi.org/10.48550/arXiv.2501.00321
                3. Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., Krueger, G., & Sutskever, I. (2021). Learning transferable visual models from natural language supervision. Proceedings of the 38th International Conference on Machine Learning, 8748–8763. https://proceedings.mlr.press/v139/radford21a.html
                4. van den Oord, A., Vinyals, O., & Kavukcuoglu, K. (2017). Neural discrete representation learning. Proceedings of the 31st International Conference on Neural Information Processing Systems, 6309–6318.
                5. Xu, H., Xie, S., Tan, X., Huang, P.-Y., Howes, R., Sharma, V., Li, S.-W., Ghosh, G., Zettlemoyer, L., & Feichtenhofer, C. (2023, October 13). Demystifying CLIP data. https://openreview.net/forum?id=5BCFlnfE1g
                6. Zhou, C., Yu, L., Babu, A., Tirumala, K., Yasunaga, M., Shamis, L., Kahn, J., Ma, X., Zettlemoyer, L., & Levy, O. (2024, October 4). Transfusion: predict the next token and diffuse images with one multi-modal model. https://openreview.net/forum?id=SI2hI0frk6

                機械学習-ドメイン-化学 (.bib)

                1. Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., Tunyasuvunakool, K., Bates, R., Žídek, A., Potapenko, A., Bridgland, A., Meyer, C., Kohl, S. A. A., Ballard, A. J., Cowie, A., Romera-Paredes, B., Nikolov, S., Jain, R., Adler, J., … Hassabis, D. (2021). Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873), 583–589. https://doi.org/10.1038/s41586-021-03819-2

                機械学習-ドメイン-画像 (.bib)

                1. Chen, T., Kornblith, S., Norouzi, M., & Hinton, G. (2020). A simple framework for contrastive learning of visual representations. Proceedings of the 37th International Conference on Machine Learning, 1597–1607. https://proceedings.mlr.press/v119/chen20j.html
                2. Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., & Fei-Fei, L. (2009). ImageNet: A large-scale hierarchical image database. 2009 IEEE Conference on Computer Vision and Pattern Recognition, 248–255. https://doi.org/10.1109/CVPR.2009.5206848
                3. drhead. (2024). The VAE used for Stable Diffusion 1.x/2.x and other models (KL-F8) has a critical flaw, probably due to bad training, that is holding back all models that use it (almost certainly including DALL-E 3). [Reddit Post]. r/StableDiffusion. https://www.reddit.com/r/StableDiffusion/comments/1ag5h5s/the_vae_used_for_stable_diffusion_1x2x_and_other/
                4. Entezari, R., Wortsman, M., Saukh, O., Shariatnia, M. M., Sedghi, H., & Schmidt, L. (2023). The role of pre-training data in transfer learning. https://doi.org/10.48550/arXiv.2302.13602
                5. Geirhos, R., Rubisch, P., Michaelis, C., Bethge, M., Wichmann, F. A., & Brendel, W. (2018, September 27). ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness. https://openreview.net/forum?id=Bygh9j09KX
                6. Hermann, K., Mobahi, H., Fel, T., & Mozer, M. C. (2023, October 13). On the foundations of shortcut learning. https://openreview.net/forum?id=Tj3xLVuE9f
                7. Jocher, G., Qiu, J., & Chaurasia, A. (2023). Ultralytics YOLO (Version 8.0.0). https://github.com/ultralytics/ultralytics
                8. Kim, J. (2025). kjsman/stable-diffusion-pytorch. https://github.com/kjsman/stable-diffusion-pytorch
                9. Odena, A., Dumoulin, V., & Olah, C. (2016). Deconvolution and checkerboard artifacts. Distill, 1(10), e3. https://doi.org/10.23915/distill.00003
                10. Plesner, A., Vontobel, T., & Wattenhofer, R. (2024). Breaking reCAPTCHAv2. https://doi.org/10.1109/COMPSAC61105.2024.00142
                11. Raghu, M., Zhang, C., Kleinberg, J., & Bengio, S. (2019). Transfusion: understanding transfer learning for medical imaging. In Proceedings of the 33rd International Conference on Neural Information Processing Systems (Number 301, pp. 3347–3357). Curran Associates Inc.
                12. Rombach, R., Blattmann, A., Lorenz, D., Esser, P., & Ommer, B. (2022). High-resolution image synthesis with latent diffusion models. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 10674–10685. https://doi.org/10.1109/CVPR52688.2022.01042
                13. Ronneberger, O., Fischer, P., & Brox, T. (2015). U-Net: convolutional networks for biomedical image segmentation. In N. Navab, J. Hornegger, W. M. Wells, & A. F. Frangi (Eds.), Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015 (pp. 234–241). Springer International Publishing. https://doi.org/10.1007/978-3-319-24574-4_28
                14. Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A. C., & Fei-Fei, L. (2015). ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 115(3), 211–252. https://doi.org/10.1007/s11263-015-0816-y
                15. Sharif, M., Bhagavatula, S., Bauer, L., & Reiter, M. K. (2016). Accessorize to a crime: real and stealthy attacks on state-of-the-art face recognition. Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, 1528–1540. https://doi.org/10.1145/2976749.2978392
                16. Szeliski, R. (2022). Computer Vision: Algorithms and Applications. Springer. https://szeliski.org/Book/
                17. THU-MIG/yolov10. (2024). THU-MIG. https://github.com/THU-MIG/yolov10
                18. Ultralytics. Ultralytics YOLO11 object detection model. Retrieved June 12, 2025, from https://github.com/ultralytics/ultralytics/blob/da98efc61d9e0467315fc86c2297c8d81e656b1a/ultralytics/cfg/models/11/yolo11.yaml
                19. Wang, A., Chen, H., Liu, L., Chen, K., Lin, Z., Han, J., & Ding, G. (2024). YOLOv10: Real-Time End-to-End Object Detection. Advances in Neural Information Processing Systems, 37, 107984–108011. https://proceedings.neurips.cc/paper_files/paper/2024/hash/c34ddd05eb089991f06f3c5dc36836e0-Abstract-Conference.html
                20. Wu, Y., & He, K. (2018). Group normalization. Computer Vision – ECCV 2018: 15th European Conference, Munich, Germany, September 8-14, 2018, Proceedings, Part XIII, 3–19. https://doi.org/10.1007/978-3-030-01261-8_1
                21. 原田 達也. (2017). 画像認識. 講談社. http://opac.dl.itc.u-tokyo.ac.jp/opac/opac_details/?lang=0&amode=11&bibid=2003372347

                機械学習-モデル-k最近傍法 (.bib)

                1. Abu Alfeilat, H. A., Hassanat, A. B. A., Lasassmeh, O., Tarawneh, A. S., Alhasanat, M. B., Eyal Salman, H. S., & Prasath, V. B. S. (2019). Effects of distance measure choice on K-nearest neighbor classifier performance: a review. Big Data, 7(4), 221–248. https://doi.org/10.1089/big.2018.0175
                2. Aumüller, M., Bernhardsson, E., & Faithfull, A. (2018). ANN-Benchmarks: A benchmarking tool for approximate nearest neighbor algorithms. https://doi.org/10.48550/arXiv.1807.05614
                3. Malkov, Y. A., & Yashunin, D. A. (2018). Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs. https://doi.org/10.48550/arXiv.1603.09320
                4. Parry, R. M., Jones, W., Stokes, T. H., Phan, J. H., Moffitt, R. A., Fang, H., Shi, L., Oberthuer, A., Fischer, M., Tong, W., & Wang, M. D. (2010). k-Nearest neighbor models for microarray gene expression analysis and clinical outcome prediction. The Pharmacogenomics Journal, 10(4), 292–309. https://doi.org/10.1038/tpj.2010.56
                5. Todeschini, R., Ballabio, D., & Consonni, V. (2020). Distances and similarity measures in chemometrics and chemoinformatics. In Encyclopedia of Analytical Chemistry (pp. 1–40). John Wiley & Sons, Ltd. https://doi.org/10.1002/9780470027318.a9438.pub2

                機械学習-モデル-ニューラルネットワーク (.bib)

                1. Foret, P., Kleiner, A., Mobahi, H., & Neyshabur, B. (2021). Sharpness-Aware Minimization for Efficiently Improving Generalization. https://doi.org/10.48550/arXiv.2010.01412
                2. Guo, C., Pleiss, G., Sun, Y., & Weinberger, K. Q. (2017). On calibration of modern neural networks. Proceedings of the 34th International Conference on Machine Learning - Volume 70, 1321–1330.
                3. Hinton, G. E., Srivastava, N., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. R. (2012). Improving neural networks by preventing co-adaptation of feature detectors. https://doi.org/10.48550/arXiv.1207.0580
                4. Oikarinen, T., & Weng, T.-W. (2022, September 29). CLIP-Dissect: Automatic Description of Neuron Representations in Deep Vision Networks. https://openreview.net/forum?id=iPWiwWHc1V
                5. Zhang, C., Bengio, S., Hardt, M., Recht, B., & Vinyals, O. (2017). Understanding deep learning requires rethinking generalization. https://doi.org/10.48550/arXiv.1611.03530

                機械学習-モデル-ノンパラメトリックモデル (.bib)

                1. Belkin, M., Rakhlin, A., & Tsybakov, A. B. (2019). Does data interpolation contradict statistical optimality? The 22nd International Conference on Artificial Intelligence and Statistics, 1611–1619. http://proceedings.mlr.press/v89/belkin19a.html

                機械学習-モデル-線型モデル (.bib)

                1. Seber, G. A. F., & Lee, A. J. (2003). Linear Regression Analysis (2nd ed.). Wiley.

                機械学習-モデル-過剰パラメーターモデル (.bib)

                1. Belkin, M., Rakhlin, A., & Tsybakov, A. B. (2019). Does data interpolation contradict statistical optimality? The 22nd International Conference on Artificial Intelligence and Statistics, 1611–1619. http://proceedings.mlr.press/v89/belkin19a.html
                2. Li, H., Xu, Z., Taylor, G., Studer, C., & Goldstein, T. (2018). Visualizing the loss landscape of neural nets. Advances in Neural Information Processing Systems, 31. https://proceedings.neurips.cc/paper_files/paper/2018/hash/a41b3bb3e6b050b6c9067c67f663b915-Abstract.html

                機械学習-モデル (.bib)

                1. Abu Alfeilat, H. A., Hassanat, A. B. A., Lasassmeh, O., Tarawneh, A. S., Alhasanat, M. B., Eyal Salman, H. S., & Prasath, V. B. S. (2019). Effects of distance measure choice on K-nearest neighbor classifier performance: a review. Big Data, 7(4), 221–248. https://doi.org/10.1089/big.2018.0175
                2. Aumüller, M., Bernhardsson, E., & Faithfull, A. (2018). ANN-Benchmarks: A benchmarking tool for approximate nearest neighbor algorithms. https://doi.org/10.48550/arXiv.1807.05614
                3. Bahdanau, D., Cho, K., & Bengio, Y. (2015). Neural machine translation by jointly learning to align and translate. In Y. Bengio & Y. LeCun (Eds.), Conference Track Proceedings of the 3rd International Conference on Learning Representations. http://arxiv.org/abs/1409.0473
                4. Belkin, M., Rakhlin, A., & Tsybakov, A. B. (2019). Does data interpolation contradict statistical optimality? The 22nd International Conference on Artificial Intelligence and Statistics, 1611–1619. http://proceedings.mlr.press/v89/belkin19a.html
                5. Bridle, J. S. (1990). Probabilistic interpretation of feedforward classification network outputs, with relationships to statistical pattern recognition. In F. F. Soulié & J. Hérault (Eds.), Neurocomputing (pp. 227–236). Springer. https://doi.org/10.1007/978-3-642-76153-9_28
                6. Foret, P., Kleiner, A., Mobahi, H., & Neyshabur, B. (2021). Sharpness-Aware Minimization for Efficiently Improving Generalization. https://doi.org/10.48550/arXiv.2010.01412
                7. Guo, C., Pleiss, G., Sun, Y., & Weinberger, K. Q. (2017). On calibration of modern neural networks. Proceedings of the 34th International Conference on Machine Learning - Volume 70, 1321–1330.
                8. Hinton, G. E., Srivastava, N., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. R. (2012). Improving neural networks by preventing co-adaptation of feature detectors. https://doi.org/10.48550/arXiv.1207.0580
                9. Li, H., Xu, Z., Taylor, G., Studer, C., & Goldstein, T. (2018). Visualizing the loss landscape of neural nets. Advances in Neural Information Processing Systems, 31. https://proceedings.neurips.cc/paper_files/paper/2018/hash/a41b3bb3e6b050b6c9067c67f663b915-Abstract.html
                10. Malkov, Y. A., & Yashunin, D. A. (2018). Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs. https://doi.org/10.48550/arXiv.1603.09320
                11. Oikarinen, T., & Weng, T.-W. (2022, September 29). CLIP-Dissect: Automatic Description of Neuron Representations in Deep Vision Networks. https://openreview.net/forum?id=iPWiwWHc1V
                12. Parry, R. M., Jones, W., Stokes, T. H., Phan, J. H., Moffitt, R. A., Fang, H., Shi, L., Oberthuer, A., Fischer, M., Tong, W., & Wang, M. D. (2010). k-Nearest neighbor models for microarray gene expression analysis and clinical outcome prediction. The Pharmacogenomics Journal, 10(4), 292–309. https://doi.org/10.1038/tpj.2010.56
                13. Seber, G. A. F., & Lee, A. J. (2003). Linear Regression Analysis (2nd ed.). Wiley.
                14. Todeschini, R., Ballabio, D., & Consonni, V. (2020). Distances and similarity measures in chemometrics and chemoinformatics. In Encyclopedia of Analytical Chemistry (pp. 1–40). John Wiley & Sons, Ltd. https://doi.org/10.1002/9780470027318.a9438.pub2
                15. Zhang, C., Bengio, S., Hardt, M., Recht, B., & Vinyals, O. (2017). Understanding deep learning requires rethinking generalization. https://doi.org/10.48550/arXiv.1611.03530

                機械学習-不均衡データ (.bib)

                1. Van Hulse, J., Khoshgoftaar, T. M., & Napolitano, A. (2007). Experimental perspectives on learning from imbalanced data. Proceedings of the 24th International Conference on Machine Learning, 935–942. https://doi.org/10.1145/1273496.1273614
                2. Wallace, B. C., Small, K., Brodley, C. E., & Trikalinos, T. A. (2011). Class imbalance, redux. IEEE 11th International Conference on Data Mining, 754–763. https://doi.org/10.1109/ICDM.2011.33

                機械学習-全般 (.bib)

                1. Friedman, J., Hastie, T., & Tibshirani, R. (2001). The Elements of Statistical Learning (Vol. 1). Springer Series in Statistics. Springer. http://statweb.stanford.edu/~tibs/book/preface.ps
                2. Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press. http://www.deeplearningbook.org
                3. James, G., Witten, D., Hastie, T., Tibshirani, R., & Taylor, J. (2023). An Introduction to Statistical Learning, with Applications in Python. Springer.
                4. Murphy, K. P. (2022). Probabilistic Machine Learning: An Introduction. MIT Press. https://probml.github.io/pml-book/book1.html
                5. Raschka, S., & Mirjalili, V. (2020). Python機械学習プログラミング:達人データサイエンティストによる理論と実践 (福島 真太朗 & 株式会社クイープ, Trans.; 第3版). インプレス.
                6. Wolpert, D. H. (1996). The lack of a priori distinctions between learning algorithms. Neural Computation, 8(7), 1341–1390. https://doi.org/10.1162/neco.1996.8.7.1341
                7. Wolpert, D. H., & Macready, W. G. (1997). No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation, 1(1), 67–82. https://doi.org/10.1109/4235.585893
                8. Wolpert, D. H. (2002). The supervised learning no-free-lunch theorems. In R. Roy, M. Köppen, S. Ovaska, T. Furuhashi, & F. Hoffmann (Eds.), Soft Computing and Industry: Recent Applications (pp. 25–42). Springer. https://doi.org/10.1007/978-1-4471-0123-9_3
                9. Zhou, Z.-H. (2021). Machine Learning. Springer Singapore. https://doi.org/10.1007/978-981-15-1967-3
                10. 周 志华. (2022). 機械学習 (大和田 勇人, 玄 光男, 下川 朝有, & 郝 新厂, Trans.). 近代科学社.

                機械学習-基本の枠組み-Bayesian methods-Gaussian processes (.bib)

                1. Rasmussen, C. E., & Williams, C. K. I. (2005). Gaussian Processes for Machine Learning. The MIT Press.

                機械学習-基本の枠組み-Bayesian methods-Stochastic Gradient Langevin Dynamics (.bib)

                1. Welling, M., & Teh, Y. W. (2011). Bayesian learning via stochastic gradient langevin dynamics. Proceedings of the 28th International Conference on International Conference on Machine Learning, 681–688.

                機械学習-基本の枠組み-Bayesian methods-Variational inference (.bib)

                1. Gershman, S. J., & Goodman, N. D. (2014). Amortized Inference in Probabilistic Reasoning. Proceedings of the 36th Annual Meeting of the Cognitive Science Society.
                2. Kingma, D. P., & Welling, M. (2019). An Introduction to Variational Autoencoders. Foundations and Trends® in Machine Learning, 12(4), 307–392. https://doi.org/10.1561/2200000056

                機械学習-基本の枠組み-Bayesian methods-全般 (.bib)

                1. Edwards, W. (1982). Conservatism in human information processing. In A. Tversky, D. Kahneman, & P. Slovic (Eds.), Judgment under Uncertainty: Heuristics and Biases (pp. 359–369). Cambridge University Press. https://doi.org/10.1017/CBO9780511809477.026
                2. 繁桝 算男. (1985). ベイズ統計入門. 東京大学出版会.

                機械学習-基本の枠組み-Bayesian methods (.bib)

                1. Edwards, W. (1982). Conservatism in human information processing. In A. Tversky, D. Kahneman, & P. Slovic (Eds.), Judgment under Uncertainty: Heuristics and Biases (pp. 359–369). Cambridge University Press. https://doi.org/10.1017/CBO9780511809477.026
                2. 繁桝 算男. (1985). ベイズ統計入門. 東京大学出版会.
                3. Gershman, S. J., & Goodman, N. D. (2014). Amortized Inference in Probabilistic Reasoning. Proceedings of the 36th Annual Meeting of the Cognitive Science Society.
                4. Kingma, D. P., & Welling, M. (2019). An Introduction to Variational Autoencoders. Foundations and Trends® in Machine Learning, 12(4), 307–392. https://doi.org/10.1561/2200000056
                5. Rasmussen, C. E., & Williams, C. K. I. (2005). Gaussian Processes for Machine Learning. The MIT Press.
                6. Welling, M., & Teh, Y. W. (2011). Bayesian learning via stochastic gradient langevin dynamics. Proceedings of the 28th International Conference on International Conference on Machine Learning, 681–688.

                機械学習-基本の枠組み-Hyperparameter tuning (.bib)

                1. Bergstra, J., & Bengio, Y. (2012). Random search for hyper-parameter optimization. Journal of Machine Learning Research, 13, 281–305.

                機械学習-基本の枠組み-不均衡データ (.bib)

                1. Van Hulse, J., Khoshgoftaar, T. M., & Napolitano, A. (2007). Experimental perspectives on learning from imbalanced data. Proceedings of the 24th International Conference on Machine Learning, 935–942. https://doi.org/10.1145/1273496.1273614
                2. Wallace, B. C., Small, K., Brodley, C. E., & Trikalinos, T. A. (2011). Class imbalance, redux. IEEE 11th International Conference on Data Mining, 754–763. https://doi.org/10.1109/ICDM.2011.33

                機械学習-基本の枠組み-事前学習 (.bib)

                1. Chen, T., Kornblith, S., Norouzi, M., & Hinton, G. (2020). A simple framework for contrastive learning of visual representations. Proceedings of the 37th International Conference on Machine Learning, 1597–1607. https://proceedings.mlr.press/v119/chen20j.html

                機械学習-基本の枠組み-全般 (.bib)

                1. Friedman, J., Hastie, T., & Tibshirani, R. (2001). The Elements of Statistical Learning (Vol. 1). Springer Series in Statistics. Springer. http://statweb.stanford.edu/~tibs/book/preface.ps
                2. James, G., Witten, D., Hastie, T., Tibshirani, R., & Taylor, J. (2023). An Introduction to Statistical Learning, with Applications in Python. Springer.
                3. Murphy, K. P. (2022). Probabilistic Machine Learning: An Introduction. MIT Press. https://probml.github.io/pml-book/book1.html
                4. Raschka, S., & Mirjalili, V. (2020). Python機械学習プログラミング:達人データサイエンティストによる理論と実践 (福島 真太朗 & 株式会社クイープ, Trans.; 第3版). インプレス.
                5. Wolpert, D. H. (1996). The lack of a priori distinctions between learning algorithms. Neural Computation, 8(7), 1341–1390. https://doi.org/10.1162/neco.1996.8.7.1341
                6. Wolpert, D. H., & Macready, W. G. (1997). No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation, 1(1), 67–82. https://doi.org/10.1109/4235.585893
                7. Wolpert, D. H. (2002). The supervised learning no-free-lunch theorems. In R. Roy, M. Köppen, S. Ovaska, T. Furuhashi, & F. Hoffmann (Eds.), Soft Computing and Industry: Recent Applications (pp. 25–42). Springer. https://doi.org/10.1007/978-1-4471-0123-9_3
                8. Zhou, Z.-H. (2021). Machine Learning. Springer Singapore. https://doi.org/10.1007/978-981-15-1967-3
                9. 周 志华. (2022). 機械学習 (大和田 勇人, 玄 光男, 下川 朝有, & 郝 新厂, Trans.). 近代科学社.

                機械学習-基本の枠組み-共通理解 (.bib)

                1. Goldblum, M., Finzi, M., Rowan, K., & Wilson, A. G. (2024). Position: the no free lunch theorem, Kolmogorov complexity, and the role of inductive biases in machine learning. Proceedings of the 41st International Conference on Machine Learning (Position Paper Track), 235, 15788–15808.
                2. No Free Lunch Theorems. Retrieved February 15, 2025, from http://www.no-free-lunch.org/
                3. Wolpert, D. H. (1996). The lack of a priori distinctions between learning algorithms. Neural Computation, 8(7), 1341–1390. https://doi.org/10.1162/neco.1996.8.7.1341
                4. Wolpert, D. H., & Macready, W. G. (1997). No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation, 1(1), 67–82. https://doi.org/10.1109/4235.585893
                5. Wolpert, D. H. (2002). The supervised learning no-free-lunch theorems. In R. Roy, M. Köppen, S. Ovaska, T. Furuhashi, & F. Hoffmann (Eds.), Soft Computing and Industry: Recent Applications (pp. 25–42). Springer. https://doi.org/10.1007/978-1-4471-0123-9_3

                機械学習-基本の枠組み-損失関数 (.bib)

                1. Gneiting, T. (2011). Making and evaluating point forecasts. Journal of the American Statistical Association, 106(494), 746–762. https://doi.org/10.1198/jasa.2011.r10138
                2. Koenker, R. (2005). Quantile Regression (Number 38). Cambridge University Press.
                3. Koenker, R., & Bassett, G., Jr. (1978). Regression quantiles. Econometrica, 46(1), 33–50. https://doi.org/10.2307/1913643
                4. Mukhoti, J., Kulharia, V., Sanyal, A., Golodetz, S., Torr, P. H. S., & Dokania, P. K. (2020). Calibrating deep neural networks using focal loss. Advances in Neural Information Processing Systems, 33, 15288–15299.
                5. Osband, K. H. (1985). Providing Incentives for Better Cost Forecasting [PhD thesis]. University of California, Berkeley.
                6. Steinwart, I., Pasin, C., Williamson, R., & Zhang, S. (2014). Elicitation and identification of properties. Proceedings of The 27th Conference on Learning Theory, 482–526. https://proceedings.mlr.press/v35/steinwart14.html

                機械学習-基本の枠組み-統計的学習理論 (.bib)

                1. Bach, F. (2024). Learning Theory from First Principles. The MIT Press.
                2. Cover, T., & Hart, P. (1967). Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 13(1), 21–27. https://doi.org/10.1109/TIT.1967.1053964
                3. 金森 敬文. (2015). 統計的学習理論. 講談社.
                4. Mohri, M., Rostamizadeh, A., & Talwalkar, A. (2018). Foundations of Machine Learning (2nd ed.). The MIT Press.
                5. Shalev-Shwartz, S., & Ben-David, S. (2014). Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press. https://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning/copy.html
                6. Tsybakov, A. B. (2009). Introduction to Nonparametric Estimation. Springer.
                7. Vapnik, V. N. (2000). The Nature of Statistical Learning Theory. Springer New York. https://doi.org/10.1007/978-1-4757-3264-1

                機械学習-基本の枠組み-表現学習 (.bib)

                1. Geirhos, R., Rubisch, P., Michaelis, C., Bethge, M., Wichmann, F. A., & Brendel, W. (2018, September 27). ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness. https://openreview.net/forum?id=Bygh9j09KX
                2. Geirhos, R., Jacobsen, J.-H., Michaelis, C., Zemel, R., Brendel, W., Bethge, M., & Wichmann, F. A. (2020). Shortcut learning in deep neural networks. Nature Machine Intelligence, 2(11), 665–673. https://doi.org/10.1038/s42256-020-00257-z
                3. Hermann, K., Mobahi, H., Fel, T., & Mozer, M. C. (2023, October 13). On the foundations of shortcut learning. https://openreview.net/forum?id=Tj3xLVuE9f

                機械学習-基本の枠組み-評価指標 (.bib)

                1. Brodersen, K. H., Ong, C. S., Stephan, K. E., & Buhmann, J. M. (2010). The balanced accuracy and its posterior distribution. 2010 20th International Conference on Pattern Recognition, 3121–3124. https://doi.org/10.1109/ICPR.2010.764

                機械学習-基本の枠組み-類似度学習 (.bib)

                1. Jaiswal, A., Babu, A. R., Zadeh, M. Z., Banerjee, D., & Makedon, F. (2021). A Survey on Contrastive Self-Supervised Learning. Technologies, 9(1), 2. https://doi.org/10.3390/technologies9010002
                2. Weinberger, K. Q., & Saul, L. K. (2009). Distance metric learning for large margin nearest neighbor classification. Journal of Machine Learning Research, 10(Feb), 207–244. http://www.jmlr.org/papers/v10/weinberger09a.html

                機械学習-基本の枠組み (.bib)

                1. Bach, F. (2024). Learning Theory from First Principles. The MIT Press.
                2. Bergstra, J., & Bengio, Y. (2012). Random search for hyper-parameter optimization. Journal of Machine Learning Research, 13, 281–305.
                3. Brodersen, K. H., Ong, C. S., Stephan, K. E., & Buhmann, J. M. (2010). The balanced accuracy and its posterior distribution. 2010 20th International Conference on Pattern Recognition, 3121–3124. https://doi.org/10.1109/ICPR.2010.764
                4. Chen, T., Kornblith, S., Norouzi, M., & Hinton, G. (2020). A simple framework for contrastive learning of visual representations. Proceedings of the 37th International Conference on Machine Learning, 1597–1607. https://proceedings.mlr.press/v119/chen20j.html
                5. Cover, T., & Hart, P. (1967). Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 13(1), 21–27. https://doi.org/10.1109/TIT.1967.1053964
                6. Edwards, W. (1982). Conservatism in human information processing. In A. Tversky, D. Kahneman, & P. Slovic (Eds.), Judgment under Uncertainty: Heuristics and Biases (pp. 359–369). Cambridge University Press. https://doi.org/10.1017/CBO9780511809477.026
                7. 繁桝 算男. (1985). ベイズ統計入門. 東京大学出版会.
                8. Geirhos, R., Rubisch, P., Michaelis, C., Bethge, M., Wichmann, F. A., & Brendel, W. (2018, September 27). ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness. https://openreview.net/forum?id=Bygh9j09KX
                9. Geirhos, R., Jacobsen, J.-H., Michaelis, C., Zemel, R., Brendel, W., Bethge, M., & Wichmann, F. A. (2020). Shortcut learning in deep neural networks. Nature Machine Intelligence, 2(11), 665–673. https://doi.org/10.1038/s42256-020-00257-z
                10. Gershman, S. J., & Goodman, N. D. (2014). Amortized Inference in Probabilistic Reasoning. Proceedings of the 36th Annual Meeting of the Cognitive Science Society.
                11. Gneiting, T. (2011). Making and evaluating point forecasts. Journal of the American Statistical Association, 106(494), 746–762. https://doi.org/10.1198/jasa.2011.r10138
                12. Goldblum, M., Finzi, M., Rowan, K., & Wilson, A. G. (2024). Position: the no free lunch theorem, Kolmogorov complexity, and the role of inductive biases in machine learning. Proceedings of the 41st International Conference on Machine Learning (Position Paper Track), 235, 15788–15808.
                13. Hermann, K., Mobahi, H., Fel, T., & Mozer, M. C. (2023, October 13). On the foundations of shortcut learning. https://openreview.net/forum?id=Tj3xLVuE9f
                14. Jaiswal, A., Babu, A. R., Zadeh, M. Z., Banerjee, D., & Makedon, F. (2021). A Survey on Contrastive Self-Supervised Learning. Technologies, 9(1), 2. https://doi.org/10.3390/technologies9010002
                15. 金森 敬文. (2015). 統計的学習理論. 講談社.
                16. Kingma, D. P., & Welling, M. (2019). An Introduction to Variational Autoencoders. Foundations and Trends® in Machine Learning, 12(4), 307–392. https://doi.org/10.1561/2200000056
                17. Koenker, R. (2005). Quantile Regression (Number 38). Cambridge University Press.
                18. Koenker, R., & Bassett, G., Jr. (1978). Regression quantiles. Econometrica, 46(1), 33–50. https://doi.org/10.2307/1913643
                19. Mohri, M., Rostamizadeh, A., & Talwalkar, A. (2018). Foundations of Machine Learning (2nd ed.). The MIT Press.
                20. Mukhoti, J., Kulharia, V., Sanyal, A., Golodetz, S., Torr, P. H. S., & Dokania, P. K. (2020). Calibrating deep neural networks using focal loss. Advances in Neural Information Processing Systems, 33, 15288–15299.
                21. No Free Lunch Theorems. Retrieved February 15, 2025, from http://www.no-free-lunch.org/
                22. Osband, K. H. (1985). Providing Incentives for Better Cost Forecasting [PhD thesis]. University of California, Berkeley.
                23. Rasmussen, C. E., & Williams, C. K. I. (2005). Gaussian Processes for Machine Learning. The MIT Press.
                24. Shalev-Shwartz, S., & Ben-David, S. (2014). Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press. https://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning/copy.html
                25. Steinwart, I., Pasin, C., Williamson, R., & Zhang, S. (2014). Elicitation and identification of properties. Proceedings of The 27th Conference on Learning Theory, 482–526. https://proceedings.mlr.press/v35/steinwart14.html
                26. Tsybakov, A. B. (2009). Introduction to Nonparametric Estimation. Springer.
                27. Van Hulse, J., Khoshgoftaar, T. M., & Napolitano, A. (2007). Experimental perspectives on learning from imbalanced data. Proceedings of the 24th International Conference on Machine Learning, 935–942. https://doi.org/10.1145/1273496.1273614
                28. Vapnik, V. N. (2000). The Nature of Statistical Learning Theory. Springer New York. https://doi.org/10.1007/978-1-4757-3264-1
                29. Wallace, B. C., Small, K., Brodley, C. E., & Trikalinos, T. A. (2011). Class imbalance, redux. IEEE 11th International Conference on Data Mining, 754–763. https://doi.org/10.1109/ICDM.2011.33
                30. Weinberger, K. Q., & Saul, L. K. (2009). Distance metric learning for large margin nearest neighbor classification. Journal of Machine Learning Research, 10(Feb), 207–244. http://www.jmlr.org/papers/v10/weinberger09a.html
                31. Welling, M., & Teh, Y. W. (2011). Bayesian learning via stochastic gradient langevin dynamics. Proceedings of the 28th International Conference on International Conference on Machine Learning, 681–688.
                32. Wolpert, D. H. (1996). The lack of a priori distinctions between learning algorithms. Neural Computation, 8(7), 1341–1390. https://doi.org/10.1162/neco.1996.8.7.1341
                33. Wolpert, D. H., & Macready, W. G. (1997). No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation, 1(1), 67–82. https://doi.org/10.1109/4235.585893
                34. Wolpert, D. H. (2002). The supervised learning no-free-lunch theorems. In R. Roy, M. Köppen, S. Ovaska, T. Furuhashi, & F. Hoffmann (Eds.), Soft Computing and Industry: Recent Applications (pp. 25–42). Springer. https://doi.org/10.1007/978-1-4471-0123-9_3

                機械学習-損失関数 (.bib)

                1. Koenker, R. (2005). Quantile Regression (Number 38). Cambridge University Press.
                2. Koenker, R., & Bassett, G., Jr. (1978). Regression quantiles. Econometrica, 46(1), 33–50. https://doi.org/10.2307/1913643

                機械学習-統計的学習理論 (.bib)

                1. Bach, F. (2024). Learning Theory from First Principles. The MIT Press.
                2. 金森 敬文. (2015). 統計的学習理論. 講談社.
                3. Mohri, M., Rostamizadeh, A., & Talwalkar, A. (2018). Foundations of Machine Learning (2nd ed.). The MIT Press.
                4. Shalev-Shwartz, S., & Ben-David, S. (2014). Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press. https://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning/copy.html
                5. Vapnik, V. N. (2000). The Nature of Statistical Learning Theory. Springer New York. https://doi.org/10.1007/978-1-4757-3264-1

                機械学習-解釈性 (.bib)

                1. Molnar, C. Interpretable Machine Learning. Retrieved February 23, 2025, from https://christophm.github.io/interpretable-ml-book/
                2. Molnar, C. Interpretable Machine Learning(邦訳). Retrieved February 23, 2025, from https://hacarus.github.io/interpretable-ml-book-ja/index.html

                機械学習-評価指標 (.bib)

                  機械学習 (.bib)

                  1. 1,000億パラメータの独自LLM「PLaMo-100B」の事後学習が完了. (2024). Preferred Networks Research & Development. https://tech.preferred.jp/ja/blog/plamo-100b-post-training/
                  2. 7群7編 分散協調とエージェント – 電子情報通信学会知識ベース. Retrieved July 5, 2025, from https://www.ieice-hbkb.org/portal/7%e7%be%a4%e3%80%80%e3%82%b3%e3%83%b3%e3%83%94%e3%83%a5%e3%83%bc%e3%82%bf-%ef%bd%bf%ef%be%8c%ef%be%84%ef%bd%b3%ef%bd%aa%ef%bd%b1/7%e7%be%a47%e7%b7%a8-%e5%88%86%e6%95%a3%e5%8d%94%e8%aa%bf%e3%81%a8%e3%82%a8%e3%83%bc%e3%82%b8%e3%82%a7%e3%83%b3%e3%83%88/
                  3. Abu Alfeilat, H. A., Hassanat, A. B. A., Lasassmeh, O., Tarawneh, A. S., Alhasanat, M. B., Eyal Salman, H. S., & Prasath, V. B. S. (2019). Effects of distance measure choice on K-nearest neighbor classifier performance: a review. Big Data, 7(4), 221–248. https://doi.org/10.1089/big.2018.0175
                  4. Ahmad, A., & Khan, S. S. (2019). Survey of state-of-the-art mixed data clustering algorithms. IEEE Access, 7, 31883–31902. https://doi.org/10.1109/ACCESS.2019.2903568
                  5. Ali, M., Fromm, M., Thellmann, K., Rutmann, R., Lübbering, M., Leveling, J., Klug, K., Ebert, J., Doll, N., Buschhoff, J., Jain, C., Weber, A., Jurkschat, L., Abdelwahab, H., John, C., Ortiz Suarez, P., Ostendorff, M., Weinbach, S., Sifa, R., … Flores-Herr, N. (2024). Tokenizer choice for LLM training: negligible or crucial? In K. Duh, H. Gomez, & S. Bethard (Eds.), Findings of the Association for Computational Linguistics: NAACL 2024 (pp. 3907–3924). Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.findings-naacl.247
                  6. Angelopoulos, A. N., & Bates, S. (2022). A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification. http://arxiv.org/abs/2107.07511
                  7. Arnett, C., Chang, T. A., & Bergen, B. (2024). A Bit of a Problem: Measurement Disparities in Dataset Sizes across Languages. In M. Melero, S. Sakti, & C. Soria (Eds.), Proceedings of the 3rd Annual Meeting of the Special Interest Group on Under-resourced Languages @ LREC-COLING 2024 (pp. 1–9). ELRA and ICCL. https://aclanthology.org/2024.sigul-1.1/
                  8. Arnett, C., & Bergen, B. (2025). Why do language models perform worse for morphologically complex languages? In O. Rambow, L. Wanner, M. Apidianaki, H. Al-Khalifa, B. D. Eugenio, & S. Schockaert (Eds.), Proceedings of the 31st International Conference on Computational Linguistics (pp. 6607–6623). Association for Computational Linguistics. https://aclanthology.org/2025.coling-main.441/
                  9. Arora, S., Liang, Y., & Ma, T. (2017, February 6). A simple but tough-to-beat baseline for sentence embeddings. https://openreview.net/forum?id=SyK00v5xx
                  10. Arthur, D., & Vassilvitskii, S. (2007). k-means++: the advantages of careful seeding. Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, 1027–1035.
                  11. Askell, A., Bai, Y., Chen, A., Drain, D., Ganguli, D., Henighan, T., Jones, A., Joseph, N., Mann, B., DasSarma, N., Elhage, N., Hatfield-Dodds, Z., Hernandez, D., Kernion, J., Ndousse, K., Olsson, C., Amodei, D., Brown, T., Clark, J., … Kaplan, J. (2021). A general language assistant as a laboratory for alignment. https://doi.org/10.48550/arXiv.2112.00861
                  12. Aumüller, M., Bernhardsson, E., & Faithfull, A. (2018). ANN-Benchmarks: A benchmarking tool for approximate nearest neighbor algorithms. https://doi.org/10.48550/arXiv.1807.05614
                  13. Azevedo, F. A. C., Carvalho, L. R. B., Grinberg, L. T., Farfel, J. M., Ferretti, R. E. L., Leite, R. E. P., Filho, W. J., Lent, R., & Herculano-Houzel, S. (2009). Equal numbers of neuronal and nonneuronal cells make the human brain an isometrically scaled-up primate brain. Journal of Comparative Neurology, 513(5), 532–541. https://doi.org/10.1002/cne.21974
                  14. Bach, F. (2024). Learning Theory from First Principles. The MIT Press.
                  15. Bahdanau, D., Cho, K., & Bengio, Y. (2015). Neural machine translation by jointly learning to align and translate. In Y. Bengio & Y. LeCun (Eds.), Conference Track Proceedings of the 3rd International Conference on Learning Representations. http://arxiv.org/abs/1409.0473
                  16. Bai, Y., Kadavath, S., Kundu, S., Askell, A., Kernion, J., Jones, A., Chen, A., Goldie, A., Mirhoseini, A., McKinnon, C., Chen, C., Olsson, C., Olah, C., Hernandez, D., Drain, D., Ganguli, D., Li, D., Tran-Johnson, E., Perez, E., … Kaplan, J. (2022). Constitutional AI: harmlessness from AI feedback. https://doi.org/10.48550/arXiv.2212.08073
                  17. Belkin, M., Rakhlin, A., & Tsybakov, A. B. (2019). Does data interpolation contradict statistical optimality? The 22nd International Conference on Artificial Intelligence and Statistics, 1611–1619. http://proceedings.mlr.press/v89/belkin19a.html
                  18. Be My Eyes. (2024). Be My Eyes Accessibility with GPT-4o. https://www.youtube.com/watch?v=Zq710AKC1gg
                  19. Bergstra, J., & Bengio, Y. (2012). Random search for hyper-parameter optimization. Journal of Machine Learning Research, 13, 281–305.
                  20. Bouman, R., Bukhsh, Z., & Heskes, T. (2024). Unsupervised anomaly detection algorithms on real-world data: how many do we need? Journal of Machine Learning Research, 25, 1–34.
                  21. Bridle, J. S. (1990). Probabilistic interpretation of feedforward classification network outputs, with relationships to statistical pattern recognition. In F. F. Soulié & J. Hérault (Eds.), Neurocomputing (pp. 227–236). Springer. https://doi.org/10.1007/978-3-642-76153-9_28
                  22. Brodersen, K. H., Ong, C. S., Stephan, K. E., & Buhmann, J. M. (2010). The balanced accuracy and its posterior distribution. 2010 20th International Conference on Pattern Recognition, 3121–3124. https://doi.org/10.1109/ICPR.2010.764
                  23. Bubeck, S., Chandrasekaran, V., Eldan, R., Gehrke, J., Horvitz, E., Kamar, E., Lee, P., Lee, Y. T., Li, Y., Lundberg, S., Nori, H., Palangi, H., Ribeiro, M. T., & Zhang, Y. (2023). Sparks of Artificial General Intelligence: Early experiments with GPT-4. https://doi.org/10.48550/arXiv.2303.12712
                  24. Fraley, C., & Raftery, A. E. (2000). Model-based clustering, discriminant analysis, and density estimation (Technical Report No. 380). Department of Statistics, University of Washington. https://csss.uw.edu/files/working-papers/2000/wp11.pdf
                  25. Campbell, N. D. F., & Kautz, J. (2014). Learning a manifold of fonts. ACM Transactions on Graphics, 33(4), 1–11. https://doi.org/10.1145/2601097.2601212
                  26. Cao, F., Liang, J., & Bai, L. (2009). A new initialization method for categorical data clustering. Expert Systems with Applications, 36(7), 10223–10228. https://doi.org/10.1016/j.eswa.2009.01.060
                  27. Carlini, N., Ippolito, D., Jagielski, M., Lee, K., Tramer, F., & Zhang, C. (2022, September 29). Quantifying memorization across neural language models. https://openreview.net/forum?id=TatRHT_1cK
                  28. Chan, C., Ginosar, S., Zhou, T., & Efros, A. (2019). Everybody dance now. 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 5932–5941. https://doi.org/10.1109/ICCV.2019.00603
                  29. Chen, S., Wong, S., Chen, L., & Tian, Y. (2023). Extending context window of large language models via positional interpolation. https://doi.org/10.48550/arXiv.2306.15595
                  30. Chen, T., Kornblith, S., Norouzi, M., & Hinton, G. (2020). A simple framework for contrastive learning of visual representations. Proceedings of the 37th International Conference on Machine Learning, 1597–1607. https://proceedings.mlr.press/v119/chen20j.html
                  31. Clauset, A., Newman, M. E. J., & Moore, C. (2004). Finding community structure in very large networks. Physical Review E, 70(6), 066111. https://doi.org/10.1103/PhysRevE.70.066111
                  32. A comprehensive survey of clustering algorithms: State-of-the-art machine learning applications, taxonomy, challenges, and future research prospects. (2022). Engineering Applications of Artificial Intelligence, 110, 104743. https://doi.org/10.1016/j.engappai.2022.104743
                  33. Cover, T., & Hart, P. (1967). Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 13(1), 21–27. https://doi.org/10.1109/TIT.1967.1053964
                  34. Dathathri, S., See, A., Ghaisas, S., Huang, P.-S., McAdam, R., Welbl, J., Bachani, V., Kaskasoli, A., Stanforth, R., Matejovicova, T., Hayes, J., Vyas, N., Merey, M. A., Brown-Cohen, J., Bunel, R., Balle, B., Cemgil, T., Ahmed, Z., Stacpoole, K., … Kohli, P. (2024). Scalable watermarking for identifying large language model outputs. Nature, 634(8035), 818–823. https://doi.org/10.1038/s41586-024-08025-4
                  35. DeepSeek-AI, Guo, D., Yang, D., Zhang, H., Song, J., Zhang, R., Xu, R., Zhu, Q., Ma, S., Wang, P., Bi, X., Zhang, X., Yu, X., Wu, Y., Wu, Z. F., Gou, Z., Shao, Z., Li, Z., Gao, Z., … Zhang, Z. (2025). DeepSeek-R1: incentivizing reasoning capability in LLMs via reinforcement learning. https://doi.org/10.48550/arXiv.2501.12948
                  36. Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., & Fei-Fei, L. (2009). ImageNet: A large-scale hierarchical image database. 2009 IEEE Conference on Computer Vision and Pattern Recognition, 248–255. https://doi.org/10.1109/CVPR.2009.5206848
                  37. Dettmers, T., Pagnoni, A., Holtzman, A., & Zettlemoyer, L. (2023, November 2). QLoRA: Efficient Finetuning of Quantized LLMs. https://openreview.net/forum?id=OUIFPHEgJU
                  38. 電子情報通信学会. (2019). 7群-7編-1章 エージェントの定義・モデル・概念. In 電子情報通信学会知識ベース (ver.1 ed., Number 7群7編). https://www.ieice-hbkb.org/files/ad_base/view_pdf.html?p=/files/07/07gun_07hen_01.pdf
                  39. Distributional Hypothesis - ACL Wiki. Retrieved June 22, 2025, from https://www.aclweb.org/aclwiki/index.php?title=Distributional_Hypothesis
                  40. Dong, H., Xiong, W., Pang, B., Wang, H., Zhao, H., Zhou, Y., Jiang, N., Sahoo, D., Xiong, C., & Zhang, T. (2024). RLHF Workflow: From Reward Modeling to Online RLHF. Transactions on Machine Learning Research. https://openreview.net/forum?id=a13aYUU9eU
                  41. drhead. (2024). The VAE used for Stable Diffusion 1.x/2.x and other models (KL-F8) has a critical flaw, probably due to bad training, that is holding back all models that use it (almost certainly including DALL-E 3). [Reddit Post]. r/StableDiffusion. https://www.reddit.com/r/StableDiffusion/comments/1ag5h5s/the_vae_used_for_stable_diffusion_1x2x_and_other/
                  42. The Llama 3 herd of models. (2024). https://doi.org/10.48550/arXiv.2407.21783
                  43. Dutta, D., Ansari, F., Chakrabarty, A., & Das, S. (2025). On the existence of universal simulators of attention. https://doi.org/10.48550/arXiv.2506.18739
                  44. Edwards, W. (1982). Conservatism in human information processing. In A. Tversky, D. Kahneman, & P. Slovic (Eds.), Judgment under Uncertainty: Heuristics and Biases (pp. 359–369). Cambridge University Press. https://doi.org/10.1017/CBO9780511809477.026
                  45. Elad, M. (2010). Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing. Springer. https://doi.org/10.1007/978-1-4419-7011-4
                  46. Eloundou, T., Manning, S., Mishkin, P., & Rock, D. (2023). GPTs are GPTs: an early look at the labor market impact potential of large language models. https://doi.org/10.48550/arXiv.2303.10130
                  47. Eloundou, T., Manning, S., Mishkin, P., & Rock, D. (2024). GPTs are GPTs: Labor market impact potential of LLMs. Science, 384(6702), 1306–1308. https://doi.org/10.1126/science.adj0998
                  48. Emsley, R. (2023). ChatGPT: these are not hallucinations – they’re fabrications and falsifications. Schizophrenia, 9(1), 1–2. https://doi.org/10.1038/s41537-023-00379-4
                  49. Entezari, R., Wortsman, M., Saukh, O., Shariatnia, M. M., Sedghi, H., & Schmidt, L. (2023). The role of pre-training data in transfer learning. https://doi.org/10.48550/arXiv.2302.13602
                  50. Ester, M., Kriegel, H.-P., Sander, J., & Xu, X. (1996). A density-based algorithm for discovering clusters in large spatial databases with noise. Proceedings of 2nd International Conference on Knowledge Discovery and Data Mining.
                  51. 繁桝 算男. (1985). ベイズ統計入門. 東京大学出版会.
                  52. Fedus, W., Zoph, B., & Shazeer, N. (2022). Switch transformers: scaling to trillion parameter models with simple and efficient sparsity. J. Mach. Learn. Res., 23(1), 120:5232–120:5270.
                  53. Foret, P., Kleiner, A., Mobahi, H., & Neyshabur, B. (2021). Sharpness-Aware Minimization for Efficiently Improving Generalization. https://doi.org/10.48550/arXiv.2010.01412
                  54. Fraley, C., & Raftery, A. E. (1998). How Many Clusters? Which Clustering Method? Answers Via Model-Based Cluster Analysis. The Computer Journal, 41(8), 578–588. https://doi.org/10.1093/comjnl/41.8.578
                  55. Friedman, J., Hastie, T., & Tibshirani, R. (2001). The Elements of Statistical Learning (Vol. 1). Springer Series in Statistics. Springer New York. http://statweb.stanford.edu/~tibs/book/preface.ps
                  56. 冨岡 亮太. (2015). スパース性に基づく機械学習 = Machine learning with sparsity inducing regularizations (MLP機械学習プロフェッショナルシリーズ) (講談社サイエンティフィク, Ed.). 講談社. https://elib.maruzen.co.jp/elib/html/BookDetail/Id/3000080942/
                  57. Fu, L., Yang, B., Kuang, Z., Song, J., Li, Y., Zhu, L., Luo, Q., Wang, X., Lu, H., Huang, M., Li, Z., Tang, G., Shan, B., Lin, C., Liu, Q., Wu, B., Feng, H., Liu, H., Huang, C., … Bai, X. (2024). OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning (Version 1). https://doi.org/10.48550/arXiv.2501.00321
                  58. Geirhos, R., Rubisch, P., Michaelis, C., Bethge, M., Wichmann, F. A., & Brendel, W. (2018, September 27). ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness. https://openreview.net/forum?id=Bygh9j09KX
                  59. Geirhos, R., Jacobsen, J.-H., Michaelis, C., Zemel, R., Brendel, W., Bethge, M., & Wichmann, F. A. (2020). Shortcut learning in deep neural networks. Nature Machine Intelligence, 2(11), 665–673. https://doi.org/10.1038/s42256-020-00257-z
                  60. Gemini Team, Google. (2025). Gemini 2.5: pushing the frontier with advanced reasoning, multimodality, long context, and next generation agentic capabilities. https://storage.googleapis.com/deepmind-media/gemini/gemini_v2_5_report.pdf
                  61. Ge, D., Jiang, X., & Ye, Y. (2011). A note on the complexity of Lp minimization. Mathematical Programming, 129(2), 285–299. https://doi.org/10.1007/s10107-011-0470-2
                  62. Gershman, S. J., & Goodman, N. D. (2014). Amortized Inference in Probabilistic Reasoning. Proceedings of the Annual Meeting of the Cognitive Science Society, 36.
                  63. Girvan, M., & Newman, M. E. J. (2002). Community structure in social and biological networks. Proceedings of the National Academy of Sciences, 99(12), 7821–7826. https://doi.org/10.1073/pnas.122653799
                  64. Gneiting, T. (2011). Making and evaluating point forecasts. Journal of the American Statistical Association, 106(494), 746–762. https://doi.org/10.1198/jasa.2011.r10138
                  65. Goldblum, M., Finzi, M., Rowan, K., & Wilson, A. G. (2024). Position: the no free lunch theorem, Kolmogorov complexity, and the role of inductive biases in machine learning. Proceedings of the 41st International Conference on Machine Learning (Position Paper Track), 235, 15788–15808.
                  66. Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press. http://www.deeplearningbook.org
                  67. Gottesman, Y. (2023). Understand diffusion models with VAEs. Yoni Gottesman. https://yonigottesman.github.io/2023/03/11/vae.html
                  68. Guan, M. Y., Joglekar, M., Wallace, E., Jain, S., Barak, B., Helyar, A., Dias, R., Vallone, A., Ren, H., Wei, J., Chung, H. W., Toyer, S., Heidecke, J., Beutel, A., & Glaese, A. (2025). Deliberative alignment: reasoning enables safer language models. https://doi.org/10.48550/arXiv.2412.16339
                  69. Guo, C., Pleiss, G., Sun, Y., & Weinberger, K. Q. (2017). On calibration of modern neural networks. Proceedings of the 34th International Conference on Machine Learning - Volume 70, 1321–1330.
                  70. Hariri, S., Kind, M. C., & Brunner, R. J. (2021). Extended isolation forest. IEEE Transactions on Knowledge and Data Engineering, 33(4), 1479–1489. https://doi.org/10.1109/TKDE.2019.2947676
                  71. Hayou, S., Ghosh, N., & Yu, B. (2024). LoRA+: Efficient Low Rank Adaptation of Large Models. Proceedings of the 41st International Conference on Machine Learning, 17783–17806. https://proceedings.mlr.press/v235/hayou24a.html
                  72. 横井 祥. (2024). 「確率的なオウム」にできること、またそれがなぜできるのかについて. https://speakerdeck.com/eumesy/language-models-as-modern-version-of-the-use-theory-of-meaning
                  73. Henighan, T., Kaplan, J., Katz, M., Chen, M., Hesse, C., Jackson, J., Jun, H., Brown, T. B., Dhariwal, P., Gray, S., Hallacy, C., Mann, B., Radford, A., Ramesh, A., Ryder, N., Ziegler, D. M., Schulman, J., Amodei, D., & McCandlish, S. (2020). Scaling laws for autoregressive generative modeling. https://doi.org/10.48550/arXiv.2010.14701
                  74. Hermann, K., Mobahi, H., Fel, T., & Mozer, M. C. (2023, October 13). On the foundations of shortcut learning. https://openreview.net/forum?id=Tj3xLVuE9f
                  75. Hernandez, D., Kaplan, J., Henighan, T., & McCandlish, S. (2021). Scaling laws for transfer. https://doi.org/10.48550/arXiv.2102.01293
                  76. Hinton, G. E., Srivastava, N., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. R. (2012). Improving neural networks by preventing co-adaptation of feature detectors. https://doi.org/10.48550/arXiv.1207.0580
                  77. Ho, J., Jain, A., & Abbeel, P. (2020). Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems, 33, 6840–6851.
                  78. An empirical analysis of compute-optimal large language model training. (2022, October 31). https://openreview.net/forum?id=iBBcRUlOAPR
                  79. Training compute-optimal large language models. (2022). http://arxiv.org/abs/2203.15556
                  80. Holtzman, A., Buys, J., Du, L., Forbes, M., & Choi, Y. (2019, September 25). The curious case of neural text degeneration. https://openreview.net/forum?id=rygGQyrFvH
                  81. Hu, E. J., Shen, Y., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, S., Wang, L., & Chen, W. (2022). LoRA: Low-rank adaptation of large language models. International Conference on Learning Representations. https://openreview.net/forum?id=nZeVKeeFYf9
                  82. Huang, Z. (1998). Extensions to the k-means algorithm for clustering large data sets with categorical values. Data Mining and Knowledge Discovery, 2(3), 283–304. https://doi.org/10.1023/A:1009769707641
                  83. Hutchins, J. (1995). "The whisky was invisible", or Persistent myths of MT. MT News International, 11. https://web.archive.org/web/20210103041306/http://www.hutchinsweb.me.uk/MTNI-11-1995.pdf
                  84. Jaiswal, A., Babu, A. R., Zadeh, M. Z., Banerjee, D., & Makedon, F. (2021). A Survey on Contrastive Self-Supervised Learning. Technologies, 9(1), 2. https://doi.org/10.3390/technologies9010002
                  85. James, G., Witten, D., Hastie, T., Tibshirani, R., & Taylor, J. (2023). An Introduction to Statistical Learning, with Applications in Python. Springer.
                  86. Jin, M., Yu, Q., Shu, D., Zhao, H., Hua, W., Meng, Y., Zhang, Y., & Du, M. (2024). The impact of reasoning step length on large language models (L.-W. Ku, A. Martins, & V. Srikumar, Eds.; pp. 1830–1842). Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.findings-acl.108
                  87. 金森 敬文. (2015). 統計的学習理論. 講談社.
                  88. Jocher, G., Qiu, J., & Chaurasia, A. (2023). Ultralytics YOLO (Version 8.0.0). https://github.com/ultralytics/ultralytics
                  89. Schulman, J. (2023). Reinforcement Learning from Human Feedback: Progress and Challenges. https://www.youtube.com/watch?v=hhiLw5Q_UFg
                  90. Jolliffe, I. T. (2010). Principal Component Analysis (2nd ed.). Springer.
                  91. Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., Tunyasuvunakool, K., Bates, R., Žídek, A., Potapenko, A., Bridgland, A., Meyer, C., Kohl, S. A. A., Ballard, A. J., Cowie, A., Romera-Paredes, B., Nikolov, S., Jain, R., Adler, J., … Hassabis, D. (2021). Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873), 583–589. https://doi.org/10.1038/s41586-021-03819-2
                  92. Kallini, J., Papadimitriou, I., Futrell, R., Mahowald, K., & Potts, C. (2024). Mission: Impossible language models. In L.-W. Ku, A. Martins, & V. Srikumar (Eds.), Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 14691–14714). Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.acl-long.787
                  93. Kamvar, S. D., Klein, D., & Manning, C. D. (2002). Interpreting and extending classical agglomerative clustering algorithms using a model-based approach. Proceedings of the 19th International Conference on Machine Learning, 283–290.
                  94. Kaplan, J., McCandlish, S., Henighan, T., Brown, T. B., Chess, B., Child, R., Gray, S., Radford, A., Wu, J., & Amodei, D. (2020). Scaling laws for neural language models. https://doi.org/10.48550/arXiv.2001.08361
                  95. Khan, F. (2025). FareedKhan-dev/train-deepseek-r1. https://github.com/FareedKhan-dev/train-deepseek-r1
                  96. Kim, J. (2025). kjsman/stable-diffusion-pytorch. https://github.com/kjsman/stable-diffusion-pytorch
                  97. Kingma, D. P., & Welling, M. (2019). An Introduction to Variational Autoencoders. Foundations and Trends® in Machine Learning, 12(4), 307–392. https://doi.org/10.1561/2200000056
                  98. Kirk, H. R., Jun, Y., Iqbal, H., Benussi, E., Volpin, F., Dreyer, F. A., Shtedritski, A., & Asano, Y. M. (2024). Bias out-of-the-box: an empirical analysis of intersectional occupational biases in popular generative language models. Proceedings of the 35th International Conference on Neural Information Processing Systems, 2611–2624.
                  99. Koenker, R. (2005). Quantile Regression (Number 38). Cambridge University Press.
                  100. Koenker, R., & Bassett, G., Jr. (1978). Regression quantiles. Econometrica, 46(1), 33–50. https://doi.org/10.2307/1913643
                  101. Kojima, T., Gu, S. S., Reid, M., Matsuo, Y., & Iwasawa, Y. (2022). Large language models are zero-shot reasoners. Proceedings of the 36th International Conference on Neural Information Processing Systems, 22199–22213.
                  102. Lambert, N., Morrison, J., Pyatkin, V., Huang, S., Ivison, H., Brahman, F., Miranda, L. J. V., Liu, A., Dziri, N., Lyu, S., Gu, Y., Malik, S., Graf, V., Hwang, J. D., Yang, J., Bras, R. L., Tafjord, O., Wilhelm, C., Soldaini, L., … Hajishirzi, H. (2025). Tulu 3: Pushing Frontiers in Open Language Model Post-Training. https://doi.org/10.48550/arXiv.2411.15124
                  103. Learn Prompting: Your Guide to Communicating with AI. (2023). https://learnprompting.org/
                  104. Lee, K.-H., Fischer, I., Wu, Y.-H., Marwood, D., Baluja, S., Schuurmans, D., & Chen, X. (2025). Evolving deeper LLM thinking. https://doi.org/10.48550/arXiv.2501.09891
                  105. Lee, J. D., Sun, D. L., Sun, Y., & Taylor, J. E. (2016). Exact Post-Selection Inference, with Application to the Lasso. The Annals of Statistics, 44(3), 907–927. https://www.jstor.org/stable/43818915
                  106. Li, C. (2020). OpenAI’s GPT-3 language model: a technical overview. https://lambdalabs.com/blog/demystifying-gpt-3
                  107. Liu, R., Gao, J., Zhao, J., Zhang, K., Li, X., Qi, B., Ouyang, W., & Zhou, B. (2025). Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling. https://doi.org/10.48550/arXiv.2502.06703
                  108. Liu, F. T., Ting, K. M., & Zhou, Z.-H. (2012). Isolation-Based Anomaly Detection. ACM Trans. Knowl. Discov. Data, 6(1), 3:1–3:39. https://doi.org/10.1145/2133360.2133363
                  109. Li, T., Khashabi, D., Khot, T., Sabharwal, A., & Srikumar, V. (2020). UNQOVERing stereotyping biases via underspecified questions. In T. Cohn, Y. He, & Y. Liu (Eds.), Findings of the Association for Computational Linguistics: EMNLP 2020 (pp. 3475–3489). Association for Computational Linguistics. https://doi.org/10.18653/v1/2020.findings-emnlp.311
                  110. Li, H., Xu, Z., Taylor, G., Studer, C., & Goldstein, T. (2018). Visualizing the loss landscape of neural nets. Advances in Neural Information Processing Systems, 31. https://proceedings.neurips.cc/paper_files/paper/2018/hash/a41b3bb3e6b050b6c9067c67f663b915-Abstract.html
                  111. Lu, Y., & Morgan, J. L. (2020). Homophone auditory processing in cross-linguistic perspective. Proceedings of the Linguistic Society of America, 5(1), 529–542. https://doi.org/10.3765/plsa.v5i1.4733
                  112. Malkov, Y. A., & Yashunin, D. A. (2018). Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs. https://doi.org/10.48550/arXiv.1603.09320
                  113. Ma, Y., & Fu, Y. (2012). Manifold Learning Theory and Applications. CRC Press. http://www.amazon.com/Manifold-Learning-Theory-Applications-Yunqian/dp/1439871094/
                  114. Manning, C. D., Prabhakar Raghavan, & Schütze, H. (2009). An Introduction to Information Retrieval. Cambridge University Press.
                  115. Mikolov, T., Sutskever, I., Chen, K., Corrado, G., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. Proceedings of the 27th International Conference on Neural Information Processing Systems - Volume 2, 2, 3111–3119.
                  116. Min, S., Lewis, M., Hajishirzi, H., & Zettlemoyer, L. (2022). Noisy channel language model prompting for few-shot text classification. In S. Muresan, P. Nakov, & A. Villavicencio (Eds.), Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 5316–5330). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.acl-long.365
                  117. Min, S., Lyu, X., Holtzman, A., Artetxe, M., Lewis, M., Hajishirzi, H., & Zettlemoyer, L. (2022). Rethinking the role of demonstrations: what makes in-context learning work? In Y. Goldberg, Z. Kozareva, & Y. Zhang (Eds.), Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (pp. 11048–11064). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.emnlp-main.759
                  118. Miyamoto, S. (2012). Inductive and non-inductive methods of clustering. 2012 IEEE International Conference on Granular Computing, 12–17. https://doi.org/10.1109/GrC.2012.6468710
                  119. Mohri, M., Rostamizadeh, A., & Talwalkar, A. (2018). Foundations of Machine Learning (2nd ed.). The MIT Press.
                  120. Molnar, C. Interpretable Machine Learning. Retrieved February 23, 2025, from https://christophm.github.io/interpretable-ml-book/
                  121. Molnar, C. Interpretable Machine Learning (邦訳). Retrieved February 23, 2025, from https://hacarus.github.io/interpretable-ml-book-ja/index.html
                  122. Muennighoff, N., Yang, Z., Shi, W., Li, X. L., Fei-Fei, L., Hajishirzi, H., Zettlemoyer, L., Liang, P., Candès, E., & Hashimoto, T. (2025). s1: Simple test-time scaling. https://doi.org/10.48550/arXiv.2501.19393
                  123. Muennighoff, N., Rush, A. M., Barak, B., Scao, T. L., Tazi, N., Piktus, A., Pyysalo, S., Wolf, T., & Raffel, C. (2023, November 2). Scaling data-constrained language models. https://openreview.net/forum?id=j5BuTrEj35
                  124. Mukhoti, J., Kulharia, V., Sanyal, A., Golodetz, S., Torr, P. H. S., & Dokania, P. K. (2020). Calibrating deep neural networks using focal loss. Advances in Neural Information Processing Systems, 33, 15288–15299.
                  125. Natarajan, B. K. (1995). Sparse Approximate Solutions to Linear Systems. SIAM Journal on Computing, 24(2), 227–234. https://doi.org/10.1137/S0097539792240406
                  126. Newman, M. E. J. (2004). Fast algorithm for detecting community structure in networks. Physical Review E, 69(6), 066133. https://doi.org/10.1103/PhysRevE.69.066133
                  127. Newman, M. E. J., & Girvan, M. (2004). Finding and evaluating community structure in networks. Physical Review E, 69(2), 026113. https://doi.org/10.1103/PhysRevE.69.026113
                  128. Newman, M. E. J. (2006). Modularity and community structure in networks. Proceedings of the National Academy of Sciences, 103(23), 8577–8582. https://doi.org/10.1073/pnas.0601602103
                  130. No Free Lunch Theorems. Retrieved February 15, 2025, from http://www.no-free-lunch.org/
                  131. Odena, A., Dumoulin, V., & Olah, C. (2016). Deconvolution and checkerboard artifacts. Distill, 1(10), e3. https://doi.org/10.23915/distill.00003
                  132. Oikarinen, T., & Weng, T.-W. (2022, September 29). CLIP-Dissect: Automatic Description of Neuron Representations in Deep Vision Networks. https://openreview.net/forum?id=iPWiwWHc1V
                  133. OpenAI Platform. Retrieved August 29, 2024, from https://platform.openai.com
                  134. OpenAI. (2023). GPT-4 technical report. https://doi.org/10.48550/arXiv.2303.08774
                  135. OpenAI. (2024). Learning to reason with LLMs. https://openai.com/index/learning-to-reason-with-llms/
                  136. Osband, K. H. (1985). Providing Incentives for Better Cost Forecasting [PhD thesis]. University of California, Berkeley.
                  137. Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C., Mishkin, P., Zhang, C., Agarwal, S., Slama, K., Ray, A., … Lowe, R. (2022). Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems, 35, 27730–27744.
                  138. Parry, R. M., Jones, W., Stokes, T. H., Phan, J. H., Moffitt, R. A., Fang, H., Shi, L., Oberthuer, A., Fischer, M., Tong, W., & Wang, M. D. (2010). k-Nearest neighbor models for microarray gene expression analysis and clinical outcome prediction. The Pharmacogenomics Journal, 10(4), 292–309. https://doi.org/10.1038/tpj.2010.56
                  139. Pelletier, B. (2024). On the statistical properties of the isolation forest anomaly detection method. https://hal.science/hal-04430185
                  140. 坪井 祐太, 海野 裕也, & 鈴木 潤. (2017). 深層学習による自然言語処理. 講談社.
                  141. Plesner, A., Vontobel, T., & Wattenhofer, R. (2024). Breaking reCAPTCHAv2. https://doi.org/10.1109/COMPSAC61105.2024.00142
                  142. Murphy, K. P. (2022). Probabilistic Machine Learning: An Introduction. MIT Press. https://probml.github.io/pml-book/book1.html
                  143. Prompt Engineering Guide – Nextra. Retrieved August 29, 2024, from https://www.promptingguide.ai/
                  144. Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., Krueger, G., & Sutskever, I. (2021). Learning transferable visual models from natural language supervision. Proceedings of the 38th International Conference on Machine Learning, 8748–8763. https://proceedings.mlr.press/v139/radford21a.html
                  145. Rafailov, R., Hejna, J., Park, R., & Finn, C. (2024, August 26). From r to Q*: Your Language Model is Secretly a Q-Function. https://openreview.net/forum?id=kEVcNxtqXk
                  146. Rafailov, R., Sharma, A., Mitchell, E., Manning, C. D., Ermon, S., & Finn, C. (2023, November 2). Direct preference optimization: your language model is secretly a reward model. https://openreview.net/forum?id=HPuSIXJaa9
                  147. Raghu, M., Zhang, C., Kleinberg, J., & Bengio, S. (2019). Transfusion: understanding transfer learning for medical imaging. In Proceedings of the 33rd International Conference on Neural Information Processing Systems (Number 301, pp. 3347–3357). Curran Associates Inc.
                  148. Raschka, S., & Mirjalili, V. (2020). Python機械学習プログラミング:達人データサイエンティストによる理論と実践 (福島 真太朗 & 株式会社クイープ, Trans.; 第3版). インプレス.
                  149. Rasmussen, C. E., & Williams, C. K. I. (2005). Gaussian Processes for Machine Learning. The MIT Press.
                  150. Razeghi, Y., Logan IV, R. L., Gardner, M., & Singh, S. (2022). Impact of pretraining term frequencies on few-shot numerical reasoning. In Y. Goldberg, Z. Kozareva, & Y. Zhang (Eds.), Findings of the Association for Computational Linguistics: EMNLP 2022 (pp. 840–854). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.findings-emnlp.59
                  151. A generalist agent. (2022). Transactions on Machine Learning Research. https://openreview.net/forum?id=1ikK0kHjvj
                  152. Robertson, S., & Zaragoza, H. (2009). The probabilistic relevance framework: BM25 and beyond. Found. Trends Inf. Retr., 3(4), 333–389. https://doi.org/10.1561/1500000019
                  153. Rombach, R., Blattmann, A., Lorenz, D., Esser, P., & Ommer, B. (2022). High-resolution image synthesis with latent diffusion models. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 10674–10685. https://doi.org/10.1109/CVPR52688.2022.01042
                  154. Ronneberger, O., Fischer, P., & Brox, T. (2015). U-Net: convolutional networks for biomedical image segmentation. In N. Navab, J. Hornegger, W. M. Wells, & A. F. Frangi (Eds.), Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015 (pp. 234–241). Springer International Publishing. https://doi.org/10.1007/978-3-319-24574-4_28
                  155. Firth, J. R. (1957). A synopsis of linguistic theory, 1930–1955. Studies in Linguistic Analysis. https://cir.nii.ac.jp/crid/1570854175539816192?lang=en
                  156. Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A. C., & Fei-Fei, L. (2015). ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 115(3), 211–252. https://doi.org/10.1007/s11263-015-0816-y
                  157. Santurkar, S., Durmus, E., Ladhak, F., Lee, C., Liang, P., & Hashimoto, T. (2023). Whose opinions do language models reflect? Proceedings of the 40th International Conference on Machine Learning, 202, 29971–30004.
                  158. Sardana, N., Portes, J., Doubov, S., & Frankle, J. (2024). Beyond Chinchilla-optimal: accounting for inference in language model scaling laws. Proceedings of the 41st International Conference on Machine Learning, 235, 43445–43460.
                  159. Schaeffer, R., Miranda, B., & Koyejo, S. (2023, November 2). Are emergent abilities of large language models a mirage? https://openreview.net/forum?id=ITw9edRDlD
                  160. Schubert, E., Sander, J., Ester, M., Kriegel, H. P., & Xu, X. (2017). DBSCAN revisited, revisited: why and how you should (still) use DBSCAN. ACM Trans. Database Syst., 42(3), 19:1–19:21. https://doi.org/10.1145/3068335
                  161. Schulhoff, S., Ilie, M., Balepur, N., Kahadze, K., Liu, A., Si, C., Li, Y., Gupta, A., Han, H. J., Schulhoff, S., Dulepet, P. S., Vidyadhara, S., Ki, D., Agrawal, S., Pham, C., Kroiz, G., Li, F., Tao, H., Srivastava, A., … Resnik, P. (2024). The Prompt Report: A Systematic Survey of Prompting Techniques. http://arxiv.org/abs/2406.06608
                  162. Schulman, J., Wolski, F., Dhariwal, P., Radford, A., & Klimov, O. (2017). Proximal Policy Optimization Algorithms. https://doi.org/10.48550/arXiv.1707.06347
                  163. Seber, G. A. F., & Lee, A. J. (2003). Linear Regression Analysis (2nd ed.). Wiley.
                  164. Shalev-Shwartz, S., & Ben-David, S. (2014). Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press. https://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning/copy.html
                  165. 山田 育矢, 鈴木 正敏, 山田 康輔, & 李 凌寒. (2023). 大規模言語モデル入門. 技術評論社.
                  166. 山田 育矢, 鈴木 正敏, 西川 荘介, 藤井 一喜, 山田 康輔, & 李 凌寒. (2024). 大規模言語モデル入門Ⅱ〜生成型LLMの実装と評価. 技術評論社.
                  167. Sharif, M., Bhagavatula, S., Bauer, L., & Reiter, M. K. (2016). Accessorize to a crime: real and stealthy attacks on state-of-the-art face recognition. Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, 1528–1540. https://doi.org/10.1145/2976749.2978392
                  168. 石井 健一郎, & 上田 修功. (2014). 教師なし学習入門 (わかりやすいパターン認識 続). オーム社.
                  169. Shinn, N., Cassano, F., Gopinath, A., Narasimhan, K. R., & Yao, S. (2023, November 2). Reflexion: language agents with verbal reinforcement learning. https://openreview.net/forum?id=vAElhFcKW6
                  170. SIGKDD News: 2014 SIGKDD Test of Time Award. (2014). https://www.kdd.org/News/view/2014-sigkdd-test-of-time-award
                  171. Singhal, S., Zeng, J., Bukharin, A., Zhang, Y., Shen, G., Mahabaleshwarkar, A. S., Kartal, B., Suhara, Y., Bercovich, A., Levy, I., Golan, I., Dabbah, M., El-Yaniv, R., Majumdar, S., Gitman, I., Bakhturina, E., Zhang, J. J., Su, B.-Y., Huang, G., … Konuk, T. (2025, June 21). Llama-Nemotron: Efficient Reasoning Models. https://openreview.net/forum?id=ev1xpo9mbI
                  172. Snell, C. V., Lee, J., Xu, K., & Kumar, A. (2024, October 4). Scaling LLM test-time compute optimally can be more effective than scaling parameters for reasoning. https://openreview.net/forum?id=4FWAwZtd2n
                  173. Song, Y., & Ermon, S. (2019). Generative modeling by estimating gradients of the data distribution. In Proceedings of the 33rd International Conference on Neural Information Processing Systems (Number 1067, pp. 11918–11930). Curran Associates Inc.
                  174. Song, Y. (2021). Generative modeling by estimating gradients of the data distribution. Yang Song. https://yang-song.net/blog/2021/score
                  175. Song, Y., Durkan, C., Murray, I., & Ermon, S. (2021, November 9). Maximum likelihood training of score-based diffusion models. https://openreview.net/forum?id=AklttWFnxS9
                  176. Speech and Language Processing. Retrieved June 18, 2025, from https://web.stanford.edu/~jurafsky/slp3/
                  177. Steinwart, I., Pasin, C., Williamson, R., & Zhang, S. (2014). Elicitation and identification of properties. Proceedings of The 27th Conference on Learning Theory, 482–526. https://proceedings.mlr.press/v35/steinwart14.html
                  178. Stiennon, N., Ouyang, L., Wu, J., Ziegler, D. M., Lowe, R., Voss, C., Radford, A., Amodei, D., & Christiano, P. (2020). Learning to summarize from human feedback. Proceedings of the 34th International Conference on Neural Information Processing Systems, 3008–3021.
                  179. Su, J., Ahmed, M., Lu, Y., Pan, S., Bo, W., & Liu, Y. (2024). RoFormer: enhanced transformer with rotary position embedding. Neurocomputing, 568(C). https://doi.org/10.1016/j.neucom.2023.127063
                  180. Szeliski, R. (2022). Computer Vision: Algorithms and Applications. Springer. https://szeliski.org/Book/
                  181. Tal, Y., Magar, I., & Schwartz, R. (2022). Fewer errors, but more stereotypes? The effect of model size on gender bias. In C. Hardmeier, C. Basta, M. R. Costa-jussà, G. Stanovsky, & H. Gonen (Eds.), Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP) (pp. 112–120). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.gebnlp-1.13
                  182. Thakur, A. S., Choudhary, K., Ramayapally, V. S., Vaidyanathan, S., & Hupkes, D. (2024). Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges. https://doi.org/10.48550/arXiv.2406.12624
                  183. THU-MIG/yolov10. (2024). THU-MIG. https://github.com/THU-MIG/yolov10
                  184. Todeschini, R., Ballabio, D., & Consonni, V. (2020). Distances and similarity measures in chemometrics and chemoinformatics. In Encyclopedia of Analytical Chemistry (pp. 1–40). John Wiley & Sons, Ltd. https://doi.org/10.1002/9780470027318.a9438.pub2
                  185. Tsybakov, A. B. (2009). Introduction to Nonparametric Estimation. Springer.
                  186. Ultralytics. Ultralytics YOLO11 object detection model. Retrieved June 12, 2025, from https://github.com/ultralytics/ultralytics/blob/da98efc61d9e0467315fc86c2297c8d81e656b1a/ultralytics/cfg/models/11/yolo11.yaml
                  187. Ustalov, D., & Lambert, N. (2023, July 24). Reinforcement Learning from Human Feedback: A Tutorial [Tutorial]. https://icml.cc/virtual/2023/tutorial/21554
                  188. Neural discrete representation learning. (2017). Proceedings of the 31st International Conference on Neural Information Processing Systems, 6309–6318.
                  189. Van Hulse, J., Khoshgoftaar, T. M., & Napolitano, A. (2007). Experimental perspectives on learning from imbalanced data. Proceedings of the 24th International Conference on Machine Learning, 935–942. https://doi.org/10.1145/1273496.1273614
                  190. Vapnik, V. N. (2000). The Nature of Statistical Learning Theory. Springer New York. https://doi.org/10.1007/978-1-4757-3264-1
                  191. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. Proceedings of the 31st International Conference on Neural Information Processing Systems, 6000–6010.
                  192. Vincent, P. (2011). A connection between score matching and denoising autoencoders. Neural Computation, 23(7), 1661–1674. https://doi.org/10.1162/NECO_a_00142
                  193. Transformers learn in-context by gradient descent. (2022). https://arxiv.org/abs/2212.07677v2
                  194. Vovk, V., Gammerman, A., & Shafer, G. (2022). Algorithmic Learning in a Random World. Springer International Publishing. https://doi.org/10.1007/978-3-031-06649-8
                  195. Wallace, B. C., Small, K., Brodley, C. E., & Trikalinos, T. A. (2011). Class imbalance, redux. IEEE 11th International Conference on Data Mining, 754–763. https://doi.org/10.1109/ICDM.2011.33
                  196. Wang, X., & Zhou, D. (2024, November 6). Chain-of-Thought Reasoning Without Prompting. https://openreview.net/forum?id=4Zt7S0B0Jp
                  197. Wang, A., Singh, A., Michael, J., Hill, F., Levy, O., & Bowman, S. (2018). GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding (T. Linzen, G. Chrupała, & A. Alishahi, Eds.; pp. 353–355). Association for Computational Linguistics. https://doi.org/10.18653/v1/W18-5446
                  198. Wang, Y., Li, H., Han, X., Nakov, P., & Baldwin, T. (2023). Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs. https://doi.org/10.48550/arXiv.2308.13387
                  199. Wang, Y., Yang, Q., Zeng, Z., Ren, L., Liu, L., Peng, B., Cheng, H., He, X., Wang, K., Gao, J., Chen, W., Wang, S., Du, S. S., & Shen, Y. (2025). Reinforcement learning for reasoning in large language models with one training example. https://doi.org/10.48550/arXiv.2504.20571
                  201. Wang, A., Chen, H., Liu, L., Chen, K., Lin, Z., Han, J., & Ding, G. (2024). YOLOv10: Real-Time End-to-End Object Detection. Advances in Neural Information Processing Systems, 37, 107984–108011. https://proceedings.neurips.cc/paper_files/paper/2024/hash/c34ddd05eb089991f06f3c5dc36836e0-Abstract-Conference.html
                  202. Ward Jr., J. H. (1963). Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association, 58(301), 236–244. https://doi.org/10.1080/01621459.1963.10500845
                  203. Warner, B., Chaffin, A., Clavié, B., Weller, O., Hallström, O., Taghadouini, S., Gallagher, A., Biswas, R., Ladhak, F., Aarsen, T., Cooper, N., Adams, G., Howard, J., & Poli, I. (2024). Smarter, better, faster, longer: a modern bidirectional encoder for fast, memory efficient, and long context finetuning and inference. https://doi.org/10.48550/arXiv.2412.13663
                  204. Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B., Xia, F., Chi, E. H., Le, Q. V., & Zhou, D. (2024). Chain-of-thought prompting elicits reasoning in large language models. Proceedings of the 36th International Conference on Neural Information Processing Systems, 24824–24837.
                  205. Wei, J., Tay, Y., Bommasani, R., Raffel, C., Zoph, B., Borgeaud, S., Yogatama, D., Bosma, M., Zhou, D., Metzler, D., Chi, E. H., Hashimoto, T., Vinyals, O., Liang, P., Dean, J., & Fedus, W. (2022). Emergent abilities of large language models. Transactions on Machine Learning Research. https://openreview.net/forum?id=yzkSU5zdwD
                  206. Wei, J., Wei, J., Tay, Y., Tran, D., Webson, A., Lu, Y., Chen, X., Liu, H., Huang, D., Zhou, D., & Ma, T. (2023). Larger language models do in-context learning differently. https://openreview.net/forum?id=DRGnEkbiQZ
                  207. Weinberger, K. Q., & Saul, L. K. (2009). Distance metric learning for large margin nearest neighbor classification. Journal of Machine Learning Research, 10(Feb), 207–244. http://www.jmlr.org/papers/v10/weinberger09a.html
                  208. Welling, M., & Teh, Y. W. (2011). Bayesian learning via stochastic gradient langevin dynamics. Proceedings of the 28th International Conference on International Conference on Machine Learning, 681–688.
                  209. Wen, X., Liu, Z., Zheng, S., Xu, Z., Ye, S., Wu, Z., Liang, X., Wang, Y., Li, J., Miao, Z., Bian, J., & Yang, M. (2025). Reinforcement learning with verifiable rewards implicitly incentivizes correct reasoning in base LLMs. https://doi.org/10.48550/arXiv.2506.14245
                  210. Wettig, A., Gao, T., Zhong, Z., & Chen, D. (2023). Should you mask 15% in masked language modeling? In A. Vlachos & I. Augenstein (Eds.), Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (pp. 2985–3000). Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.eacl-main.217
                  211. Wolpert, D. H. (1996). The lack of a priori distinctions between learning algorithms. Neural Computation, 8(7), 1341–1390. https://doi.org/10.1162/neco.1996.8.7.1341
                  212. Wolpert, D. H., & Macready, W. G. (1997). No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation, 1(1), 67–82. https://doi.org/10.1109/4235.585893
                  213. Wolpert, D. H. (2002). The supervised learning no-free-lunch theorems. In R. Roy, M. Köppen, S. Ovaska, T. Furuhashi, & F. Hoffmann (Eds.), Soft Computing and Industry: Recent Applications (pp. 25–42). Springer. https://doi.org/10.1007/978-1-4471-0123-9_3
                  214. Wu, Y., & He, K. (2018). Group normalization. Computer Vision – ECCV 2018: 15th European Conference, Munich, Germany, September 8-14, 2018, Proceedings, Part XIII, 3–19. https://doi.org/10.1007/978-3-030-01261-8_1
                  215. Wu, T., Lan, J., Yuan, W., Jiao, J., Weston, J. E., & Sukhbaatar, S. (2025, June 18). Thinking LLMs: general instruction following with thought generation. https://openreview.net/forum?id=z6SrgYCdey&noteId=t3y0Ev0lm6
                  216. 篠田 一聡. (2024, August 25). 論文紹介:Direct Preference Optimization: Your Language Model Is Secretly a Reward Model. https://speakerdeck.com/kazutoshishinoda/lun-wen-shao-jie-direct-preference-optimization-your-language-model-is-secretly-a-reward-model
                  217. Xiong, R., Yang, Y., He, D., Zheng, K., Zheng, S., Xing, C., Zhang, H., Lan, Y., Wang, L., & Liu, T.-Y. (2020). On layer normalization in the transformer architecture. Proceedings of the 37th International Conference on Machine Learning, 119, 10524–10533.
                  218. Xu, D., & Tian, Y. (2015). A Comprehensive Survey of Clustering Algorithms. Annals of Data Science, 2(2), 165–193. https://doi.org/10.1007/s40745-015-0040-1
                  219. Xu, H., Xie, S., Tan, X., Huang, P.-Y., Howes, R., Sharma, V., Li, S.-W., Ghosh, G., Zettlemoyer, L., & Feichtenhofer, C. (2023, October 13). Demystifying CLIP data. https://openreview.net/forum?id=5BCFlnfE1g
                  220. Yang, G., & Hu, E. J. (2022). Feature Learning in Infinite-Width Neural Networks. https://doi.org/10.48550/arXiv.2011.14522
                  221. Yang, G., Simon, J. B., & Bernstein, J. (2024). A Spectral Condition for Feature Learning. https://doi.org/10.48550/arXiv.2310.17813
                  222. Yang, G., Hu, E. J., Babuschkin, I., Sidor, S., Farhi, D., Pachocki, J., Liu, X., Chen, W., & Gao, J. (2022, March 7). Tensor programs V: tuning large neural networks via zero-shot hyperparameter transfer. https://www.microsoft.com/en-us/research/publication/tuning-large-neural-networks-via-zero-shot-hyperparameter-transfer/
                  223. 岩田 具治. (2015). トピックモデル. 講談社.
                  224. Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K. R., & Cao, Y. (2022, September 29). ReAct: synergizing reasoning and acting in language models. https://openreview.net/forum?id=WE_vluYUL-X
                  226. Yao, S., Yu, D., Zhao, J., Shafran, I., Griffiths, T. L., Cao, Y., & Narasimhan, K. R. (2023, November 2). Tree of Thoughts: deliberate problem solving with large language models. https://openreview.net/forum?id=5Xc1ecxO1h
                  227. 原田 達也. (2017). 画像認識. 講談社. http://opac.dl.itc.u-tokyo.ac.jp/opac/opac_details/?lang=0&amode=11&bibid=2003372347
                  228. Yue, Z., Zhuang, H., Bai, A., Hui, K., Jagerman, R., Zeng, H., Qin, Z., Wang, D., Wang, X., & Bendersky, M. (2024, October 4). Inference Scaling for Long-Context Retrieval Augmented Generation. https://openreview.net/forum?id=FSjIrOm1vz
                  229. Zachary, W. W. (1977). An information flow model for conflict and fission in small groups. Journal of Anthropological Research, 33(4), 452–473. http://www.jstor.org/stable/3629752
                  230. Zelikman, E., Harik, G. R., Shao, Y., Jayasiri, V., Haber, N., & Goodman, N. (2024, August 26). Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking. https://openreview.net/forum?id=oRXPiSOGH9
                  231. Zelikman, E., Wu, Y., Mu, J., & Goodman, N. (2022, October 31). STaR: bootstrapping reasoning with reasoning. https://openreview.net/forum?id=_3ELRdg2sgI
                  232. Zeng, Z., Yu, J., Gao, T., Meng, Y., Goyal, T., & Chen, D. (2023, October 13). Evaluating Large Language Models at Evaluating Instruction Following. https://openreview.net/forum?id=tr0KidwPLc
                  233. 増田 直紀, & 今野 紀雄. (2010). 複雑ネットワーク―基礎から応用まで. 近代科学社.
                  234. Zhang, Y., & Teng, Z. (2021). Natural Language Processing: A Machine Learning Perspective. Cambridge University Press.
                  235. Zhang, B., & Sennrich, R. (2019). Root mean square layer normalization. In Proceedings of the 33rd International Conference on Neural Information Processing Systems (Number 1110, pp. 12381–12392). Curran Associates Inc.
                  236. Zhang, C., Bengio, S., Hardt, M., Recht, B., & Vinyals, O. (2017). Understanding deep learning requires rethinking generalization. https://doi.org/10.48550/arXiv.1611.03530
                  237. Zhao, X., Kang, Z., Feng, A., Levine, S., & Song, D. (2025). Learning to reason without external rewards. https://doi.org/10.48550/arXiv.2505.19590
                  238. Zheng, L., Chiang, W.-L., Sheng, Y., Zhuang, S., Wu, Z., Zhuang, Y., Lin, Z., Li, Z., Li, D., Xing, E., Zhang, H., Gonzalez, J. E., & Stoica, I. (2023, November 2). Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena. https://openreview.net/forum?id=uccHPGDlao
                  239. Zhou, C., Liu, P., Xu, P., Iyer, S., Sun, J., Mao, Y., Ma, X., Efrat, A., Yu, P., Yu, L., Zhang, S., Ghosh, G., Lewis, M., Zettlemoyer, L., & Levy, O. (2023, November 2). LIMA: less is more for alignment. https://openreview.net/forum?id=KBMOKmX2he
                  240. Zhou, Z.-H. (2021). Machine Learning. Springer Singapore. https://doi.org/10.1007/978-981-15-1967-3
                  241. Zhou, C., Yu, L., Babu, A., Tirumala, K., Yasunaga, M., Shamis, L., Kahn, J., Ma, X., Zettlemoyer, L., & Levy, O. (2024, October 4). Transfusion: predict the next token and diffuse images with one multi-modal model. https://openreview.net/forum?id=SI2hI0frk6
                  242. 周 志华. (2022). 機械学習 (大和田 勇人, 玄 光男, 下川 朝有, & 郝 新厂, Trans.). 近代科学社.

                  用語集 (.bib)

                  1. Kotz, S., Balakrishnan, N., Read, C. B., & Vidakovic, B. (Eds.). (2006). Encyclopedia of Statistical Sciences (2nd ed.). Wiley-Interscience.
                  2. Sammut, C., & Webb, G. I. (Eds.). (2017). Encyclopedia of Machine Learning and Data Mining. Springer US. https://doi.org/10.1007/978-1-4899-7687-1

                  科学哲学-帰納バイアス (.bib)

                  1. Baker, A. (2022). Simplicity. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy (Summer 2022). Metaphysics Research Lab, Stanford University. https://plato.stanford.edu/archives/sum2022/entries/simplicity/

                  科学哲学 (.bib)

                  1. Baker, A. (2022). Simplicity. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy (Summer 2022). Metaphysics Research Lab, Stanford University. https://plato.stanford.edu/archives/sum2022/entries/simplicity/
                  2. Bell, E., Bryman, A., & Harley, B. (2022). Business Research Methods (6th ed.). Oxford Univ Press.

                  経営学の方法論-データ収集 (.bib)

                  1. Hand, M. (2014). From Cyberspace to the Dataverse: Trajectories in Digital Social Research. In Big Data? Qualitative Approaches to Digital Research (Vol. 13, pp. 1–27). Emerald Group Publishing Limited. https://doi.org/10.1108/S1042-319220140000013002
                  2. Kitchin, R. (2014). Big Data, new epistemologies and paradigm shifts. Big Data & Society, 1(1). https://doi.org/10.1177/2053951714528481

                  経営学の方法論-研究事例 (.bib)

                  1. Chen, C. C., & Meindl, J. R. (1991). The construction of leadership images in the popular press: the case of Donald Burr and People Express. Administrative Science Quarterly, 36(4), 521. https://doi.org/10.2307/2393273
                  2. Webb, E. J., Campbell, D. T., Schwartz, R. D., & Sechrest, L. (1966). Unobtrusive Measures: Nonreactive Research in the Social Sciences (pp. xii, 225). Rand Mcnally.
                  3. Zamanou, S., & Glaser, S. R. (1994). Moving toward participation and involvement: Managing and measuring organizational culture. Group & Organization Management, 19(4), 475–502. https://doi.org/10.1177/1059601194194005

                  経営学の方法論 (.bib)

                  1. Bell, E., Bryman, A., & Harley, B. (2022). Business Research Methods (6th ed.). Oxford Univ Press.
                  2. Chen, C. C., & Meindl, J. R. (1991). The construction of leadership images in the popular press: the case of Donald Burr and People Express. Administrative Science Quarterly, 36(4), 521. https://doi.org/10.2307/2393273
                  3. Creswell, J. W., & Plano Clark, V. L. (2017). Designing and Conducting Mixed Methods Research (3rd ed.). SAGE Publications.
                  4. Hand, M. (2014). From Cyberspace to the Dataverse: Trajectories in Digital Social Research. In Big Data? Qualitative Approaches to Digital Research (Vol. 13, pp. 1–27). Emerald Group Publishing Limited. https://doi.org/10.1108/S1042-319220140000013002
                  5. Kitchin, R. (2014). Big Data, new epistemologies and paradigm shifts. Big Data & Society, 1(1). https://doi.org/10.1177/2053951714528481
                  6. Webb, E. J., Campbell, D. T., Schwartz, R. D., & Sechrest, L. (1966). Unobtrusive Measures: Nonreactive Research in the Social Sciences (pp. xii, 225). Rand Mcnally.
                  7. Zamanou, S., & Glaser, S. R. (1994). Moving toward participation and involvement: Managing and measuring organizational culture. Group & Organization Management, 19(4), 475–502. https://doi.org/10.1177/1059601194194005

                  提供参考文献 (.bib)

                  1. 100,000 H100 clusters: power, network topology, ethernet vs infiniband, reliability, failures, checkpointing. (2024). SemiAnalysis. https://semianalysis.com/2024/06/17/100000-h100-clusters-power-network/
                  2. 1,000億パラメータの独自LLM「PLaMo-100B」の事後学習が完了. (2024). Preferred Networks Research & Development. https://tech.preferred.jp/ja/blog/plamo-100b-post-training/
                  3. 7群7編 分散協調とエージェント – 電子情報通信学会知識ベース. Retrieved July 5, 2025, from https://www.ieice-hbkb.org/portal/7%e7%be%a4%e3%80%80%e3%82%b3%e3%83%b3%e3%83%94%e3%83%a5%e3%83%bc%e3%82%bf-%ef%bd%bf%ef%be%8c%ef%be%84%ef%bd%b3%ef%bd%aa%ef%bd%b1/7%e7%be%a47%e7%b7%a8-%e5%88%86%e6%95%a3%e5%8d%94%e8%aa%bf%e3%81%a8%e3%82%a8%e3%83%bc%e3%82%b8%e3%82%a7%e3%83%b3%e3%83%88/
                  4. Abu Alfeilat, H. A., Hassanat, A. B. A., Lasassmeh, O., Tarawneh, A. S., Alhasanat, M. B., Eyal Salman, H. S., & Prasath, V. B. S. (2019). Effects of distance measure choice on K-nearest neighbor classifier performance: a review. Big Data, 7(4), 221–248. https://doi.org/10.1089/big.2018.0175
                  5. Aggarwal, C. C., Hinneburg, A., & Keim, D. A. (2001). On the surprising behavior of distance metrics in high dimensional spaces. Proceedings of the 8th International Conference on Database Theory, 420–434.
                  6. Ahmad, A., & Khan, S. S. (2019). Survey of state-of-the-art mixed data clustering algorithms. IEEE Access, 7, 31883–31902. https://doi.org/10.1109/ACCESS.2019.2903568
                  7. Ali, M., Fromm, M., Thellmann, K., Rutmann, R., Lübbering, M., Leveling, J., Klug, K., Ebert, J., Doll, N., Buschhoff, J., Jain, C., Weber, A., Jurkschat, L., Abdelwahab, H., John, C., Ortiz Suarez, P., Ostendorff, M., Weinbach, S., Sifa, R., … Flores-Herr, N. (2024). Tokenizer choice for LLM training: negligible or crucial? In K. Duh, H. Gomez, & S. Bethard (Eds.), Findings of the Association for Computational Linguistics: NAACL 2024 (pp. 3907–3924). Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.findings-naacl.247
                  8. Angelopoulos, A. N., & Bates, S. (2022). A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification. http://arxiv.org/abs/2107.07511
                  9. Angwin, J., Larson, J., Mattu, S., Kirchner, L., & ProPublica. (2016). Machine Bias. ProPublica. https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
                  10. Arnett, C., Chang, T. A., & Bergen, B. (2024). A Bit of a Problem: Measurement Disparities in Dataset Sizes across Languages. In M. Melero, S. Sakti, & C. Soria (Eds.), Proceedings of the 3rd Annual Meeting of the Special Interest Group on Under-resourced Languages @ LREC-COLING 2024 (pp. 1–9). ELRA and ICCL. https://aclanthology.org/2024.sigul-1.1/
                  11. Arnett, C., & Bergen, B. (2025). Why do language models perform worse for morphologically complex languages? In O. Rambow, L. Wanner, M. Apidianaki, H. Al-Khalifa, B. D. Eugenio, & S. Schockaert (Eds.), Proceedings of the 31st International Conference on Computational Linguistics (pp. 6607–6623). Association for Computational Linguistics. https://aclanthology.org/2025.coling-main.441/
                  12. Arora, S., Liang, Y., & Ma, T. (2017, February 6). A simple but tough-to-beat baseline for sentence embeddings. https://openreview.net/forum?id=SyK00v5xx
                  13. Arthur, D., & Vassilvitskii, S. (2007). k-means++: the advantages of careful seeding. Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, 1027–1035.
                  14. Askell, A., Bai, Y., Chen, A., Drain, D., Ganguli, D., Henighan, T., Jones, A., Joseph, N., Mann, B., DasSarma, N., Elhage, N., Hatfield-Dodds, Z., Hernandez, D., Kernion, J., Ndousse, K., Olsson, C., Amodei, D., Brown, T., Clark, J., … Kaplan, J. (2021). A general language assistant as a laboratory for alignment. https://doi.org/10.48550/arXiv.2112.00861
                  15. Aumüller, M., Bernhardsson, E., & Faithfull, A. (2018). ANN-Benchmarks: A benchmarking tool for approximate nearest neighbor algorithms. https://doi.org/10.48550/arXiv.1807.05614
                  16. Azevedo, F. A. C., Carvalho, L. R. B., Grinberg, L. T., Farfel, J. M., Ferretti, R. E. L., Leite, R. E. P., Filho, W. J., Lent, R., & Herculano-Houzel, S. (2009). Equal numbers of neuronal and nonneuronal cells make the human brain an isometrically scaled-up primate brain. Journal of Comparative Neurology, 513(5), 532–541. https://doi.org/10.1002/cne.21974
                  17. Bach, F. (2024). Learning Theory from First Principles. The MIT Press.
                  18. Bahdanau, D., Cho, K., & Bengio, Y. (2015). Neural machine translation by jointly learning to align and translate. In Y. Bengio & Y. LeCun (Eds.), Conference Track Proceedings of the 3rd International Conference on Learning Representations. http://arxiv.org/abs/1409.0473
                  19. Bai, Y., Kadavath, S., Kundu, S., Askell, A., Kernion, J., Jones, A., Chen, A., Goldie, A., Mirhoseini, A., McKinnon, C., Chen, C., Olsson, C., Olah, C., Hernandez, D., Drain, D., Ganguli, D., Li, D., Tran-Johnson, E., Perez, E., … Kaplan, J. (2022). Constitutional AI: harmlessness from AI feedback. https://doi.org/10.48550/arXiv.2212.08073
                  20. Baker, A. (2022). Simplicity. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy (Summer 2022). Metaphysics Research Lab, Stanford University. https://plato.stanford.edu/archives/sum2022/entries/simplicity/
                  21. 北川 源四郎, 竹村 彰通, 赤穂 昭太郎, 今泉 允聡, 内田 誠一, 清 智也, 高野 渉, 辻 真吾, 原 尚幸, 久野 遼平, 松原 仁, 宮地 充子, 森畑 明昌, & 宿久 洋. (2023). 応用基礎としてのデータサイエンス AI×データ活用の実践. 講談社.
                  22. Belkin, M., Rakhlin, A., & Tsybakov, A. B. (2019). Does data interpolation contradict statistical optimality? The 22nd International Conference on Artificial Intelligence and Statistics, 1611–1619. http://proceedings.mlr.press/v89/belkin19a.html
                  23. Bell, E., Bryman, A., & Harley, B. (2022). Business Research Methods (6th ed.). Oxford Univ Press.
                  24. Be My Eyes (Ed.). (2024). Be My Eyes Accessibility with GPT-4o. https://www.youtube.com/watch?v=Zq710AKC1gg
                  25. Bergstra, J., & Bengio, Y. (2012). Random search for hyper-parameter optimization. Journal of Machine Learning Research, 13, 281–305.
                  26. Bloomberg. (2024). Generative AI 2024 report: assessing opportunities and disruptions in an evolving trillion-dollar market. https://www.bloomberg.com/professional/products/bloomberg-terminal/research/bloomberg-intelligence/download/generative-ai-2024-report/
                  27. Bouman, R., Bukhsh, Z., & Heskes, T. (2024). Unsupervised anomaly detection algorithms on real-world data: how many do we need? Journal of Machine Learning Research, 25, 1–34.
                  28. Bridle, J. S. (1990). Probabilistic interpretation of feedforward classification network outputs, with relationships to statistical pattern recognition. In F. F. Soulié & J. Hérault (Eds.), Neurocomputing (pp. 227–236). Springer. https://doi.org/10.1007/978-3-642-76153-9_28
                  29. Brodersen, K. H., Ong, C. S., Stephan, K. E., & Buhmann, J. M. (2010). The balanced accuracy and its posterior distribution. 2010 20th International Conference on Pattern Recognition, 3121–3124. https://doi.org/10.1109/ICPR.2010.764
                  30. Brynjolfsson, E., Li, D., & Raymond, L. R. (2023). Generative AI at work (Number 31161) [Working Paper]. https://doi.org/10.3386/w31161
                  31. Brynjolfsson, E., Mitchell, T., & Rock, D. (2018). What can machines learn, and what does it mean for occupations and the economy? AEA Papers and Proceedings, 108, 43–47. https://doi.org/10.1257/pandp.20181019
                  32. Bubeck, S., Chandrasekaran, V., Eldan, R., Gehrke, J., Horvitz, E., Kamar, E., Lee, P., Lee, Y. T., Li, Y., Lundberg, S., Nori, H., Palangi, H., Ribeiro, M. T., & Zhang, Y. (2023). Sparks of Artificial General Intelligence: Early experiments with GPT-4. https://doi.org/10.48550/arXiv.2303.12712
                  33. Fraley, C., & Raftery, A. E. (2000). Model-based clustering, discriminant analysis, and density estimation (Technical Report No. 380). Department of Statistics, University of Washington. https://csss.uw.edu/files/working-papers/2000/wp11.pdf
                  34. Campbell, N. D. F., & Kautz, J. (2014). Learning a manifold of fonts. ACM Transactions on Graphics, 33(4), 1–11. https://doi.org/10.1145/2601097.2601212
                  35. Cao, F., Liang, J., & Bai, L. (2009). A new initialization method for categorical data clustering. Expert Systems with Applications, 36(7), 10223–10228. https://doi.org/10.1016/j.eswa.2009.01.060
                  36. Carlini, N., Ippolito, D., Jagielski, M., Lee, K., Tramer, F., & Zhang, C. (2022, September 29). Quantifying memorization across neural language models. https://openreview.net/forum?id=TatRHT_1cK
                  37. Cazzaniga, M. (2024). Gen-AI: artificial intelligence and the future of work. Staff Discussion Notes, 2024(001), 1. https://doi.org/10.5089/9798400262548.006
                  38. Chan, C., Ginosar, S., Zhou, T., & Efros, A. (2019). Everybody dance now. 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 5932–5941. https://doi.org/10.1109/ICCV.2019.00603
                  39. Chapagain, D., Kshetri, N., & Aryal, B. (2024). Deepfake disasters: a comprehensive review of technology, ethical concerns, countermeasures, and societal implications. 2024 International Conference on Emerging Trends in Networks and Computer Communications (ETNCC), 1–9. https://doi.org/10.1109/ETNCC63262.2024.10767452
                  40. Chen, C. C., & Meindl, J. R. (1991). The construction of leadership images in the popular press: the case of Donald Burr and People Express. Administrative Science Quarterly, 36(4), 521. https://doi.org/10.2307/2393273
                  41. Chen, S., Wong, S., Chen, L., & Tian, Y. (2023). Extending context window of large language models via positional interpolation. https://doi.org/10.48550/arXiv.2306.15595
                  42. Chen, T., Kornblith, S., Norouzi, M., & Hinton, G. (2020). A simple framework for contrastive learning of visual representations. Proceedings of the 37th International Conference on Machine Learning, 1597–1607. https://proceedings.mlr.press/v119/chen20j.html
                  43. 赤穂 昭太郎. (2008). カーネル多変量解析―非線形データ解析の新しい展開. 岩波書店.
                  44. Chouldechova, A. (2017). Fair Prediction with Disparate Impact: A Study of Bias in Recidivism Prediction Instruments. Big Data, 5(2), 153–163. https://doi.org/10.1089/big.2016.0047
                  45. Çinlar, E. (2011). Probability and Stochastics. Springer New York. https://doi.org/10.1007/978-0-387-87859-1
                  46. Clauset, A., Newman, M. E. J., & Moore, C. (2004). Finding community structure in very large networks. Physical Review E, 70(6), 066111. https://doi.org/10.1103/PhysRevE.70.066111
                  47. A comprehensive survey of clustering algorithms: State-of-the-art machine learning applications, taxonomy, challenges, and future research prospects. (2022). Engineering Applications of Artificial Intelligence, 110, 104743. https://doi.org/10.1016/j.engappai.2022.104743
                  48. Corbett-Davies, S., Pierson, E., Feller, A., Goel, S., & Huq, A. (2017). Algorithmic decision making and the cost of fairness. Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 797–806. https://doi.org/10.1145/3097983.3098095
                  49. Cover, T. M., & Thomas, J. A. (2006). Elements of Information Theory (2nd ed). Wiley-Interscience.
                  50. Cover, T., & Hart, P. (1967). Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 13(1), 21–27. https://doi.org/10.1109/TIT.1967.1053964
                  51. Creswell, J. W., & Plano Clark, V. L. (2017). Designing and Conducting Mixed Methods Research (3rd ed.). SAGE Publications.
                  52. Cui, Z. (K.), Demirer, M., Jaffe, S., Musolff, L., Peng, S., & Salz, T. (2025). The effects of generative AI on high-skilled work: evidence from three field experiments with software developers (Number 4945566) [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.4945566
                  53. 村上 正康, 稲葉 尚志, & 野沢 宗平. (1989). 演習 線形代数 (改訂版). 培風館.
                  54. Dalal, N., & Triggs, B. (2005). Histograms of oriented gradients for human detection. 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), 1, 886–893. https://doi.org/10.1109/CVPR.2005.177
                  55. Dathathri, S., See, A., Ghaisas, S., Huang, P.-S., McAdam, R., Welbl, J., Bachani, V., Kaskasoli, A., Stanforth, R., Matejovicova, T., Hayes, J., Vyas, N., Merey, M. A., Brown-Cohen, J., Bunel, R., Balle, B., Cemgil, T., Ahmed, Z., Stacpoole, K., … Kohli, P. (2024). Scalable watermarking for identifying large language model outputs. Nature, 634(8035), 818–823. https://doi.org/10.1038/s41586-024-08025-4
                  56. De Cock, D. (2011). Ames, Iowa: alternative to the Boston housing data as an end of semester regression project. Journal of Statistics Education, 19(3), 8. https://doi.org/10.1080/10691898.2011.11889627
                  57. DeepETA: How Uber Predicts Arrival Times Using Deep Learning. (2022). Uber Blog. https://www.uber.com/blog/deepeta-how-uber-predicts-arrival-times/
                  58. DeepSeek-AI, Guo, D., Yang, D., Zhang, H., Song, J., Zhang, R., Xu, R., Zhu, Q., Ma, S., Wang, P., Bi, X., Zhang, X., Yu, X., Wu, Y., Wu, Z. F., Gou, Z., Shao, Z., Li, Z., Gao, Z., … Zhang, Z. (2025). DeepSeek-R1: incentivizing reasoning capability in LLMs via reinforcement learning. https://doi.org/10.48550/arXiv.2501.12948
                  59. Defazio, A., Yang, X. A., Khaled, A., Mishchenko, K., Mehta, H., & Cutkosky, A. (2024, November 6). The road less scheduled. https://openreview.net/forum?id=0XeNkkENuI
                  60. delving_2025_july.png. (2025). GitHub. https://github.com/berenslab/llm-excess-vocab/blob/main/figures/post-publication-updates/delving_2025_july.png
                  61. Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., & Fei-Fei, L. (2009). ImageNet: A large-scale hierarchical image database. 2009 IEEE Conference on Computer Vision and Pattern Recognition, 248–255. https://doi.org/10.1109/CVPR.2009.5206848
                  62. Dettmers, T., Pagnoni, A., Holtzman, A., & Zettlemoyer, L. (2023, November 2). QLoRA: Efficient Finetuning of Quantized LLMs. https://openreview.net/forum?id=OUIFPHEgJU
                  63. 電子情報通信学会. (2019). 7群-7編-1章 エージェントの定義・モデル・概念. In 電子情報通信学会知識ベース (ver.1 ed., Number 7群7編). https://www.ieice-hbkb.org/files/ad_base/view_pdf.html?p=/files/07/07gun_07hen_01.pdf
                  64. Distributional Hypothesis - ACL Wiki. Retrieved June 22, 2025, from https://www.aclweb.org/aclwiki/index.php?title=Distributional_Hypothesis
                  65. Dong, H., Xiong, W., Pang, B., Wang, H., Zhao, H., Zhou, Y., Jiang, N., Sahoo, D., Xiong, C., & Zhang, T. (2024). RLHF Workflow: From Reward Modeling to Online RLHF. Transactions on Machine Learning Research. https://openreview.net/forum?id=a13aYUU9eU
                  66. Douze, M., Guzhva, A., Deng, C., Johnson, J., Szilvasy, G., Mazaré, P.-E., Lomeli, M., Hosseini, L., & Jégou, H. (2025). The Faiss library. https://doi.org/10.48550/arXiv.2401.08281
                  67. drhead. (2024). The VAE used for Stable Diffusion 1.x/2.x and other models (KL-F8) has a critical flaw, probably due to bad training, that is holding back all models that use it (almost certainly including DALL-E 3). [Reddit Post]. r/StableDiffusion. https://www.reddit.com/r/StableDiffusion/comments/1ag5h5s/the_vae_used_for_stable_diffusion_1x2x_and_other/
                  68. The Llama 3 herd of models. (2024). https://doi.org/10.48550/arXiv.2407.21783
                  69. Dutta, D., Ansari, F., Chakrabarty, A., & Das, S. (2025). On the existence of universal simulators of attention. https://doi.org/10.48550/arXiv.2506.18739
                  70. Edwards, W. (1982). Conservatism in human information processing. In A. Tversky, D. Kahneman, & P. Slovic (Eds.), Judgment under Uncertainty: Heuristics and Biases (pp. 359–369). Cambridge University Press. https://doi.org/10.1017/CBO9780511809477.026
                  71. Elad, M. (2010). Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing. Springer. https://doi.org/10.1007/978-1-4419-7011-4
                  72. Eloundou, T., Manning, S., Mishkin, P., & Rock, D. (2023). GPTs are GPTs: an early look at the labor market impact potential of large language models. https://doi.org/10.48550/arXiv.2303.10130
                  73. Eloundou, T., Manning, S., Mishkin, P., & Rock, D. (2024). GPTs are GPTs: Labor market impact potential of LLMs. Science, 384(6702), 1306–1308. https://doi.org/10.1126/science.adj0998
                  74. Emsley, R. (2023). ChatGPT: these are not hallucinations – they’re fabrications and falsifications. Schizophrenia, 9(1), 1–2. https://doi.org/10.1038/s41537-023-00379-4
                  75. Entezari, R., Wortsman, M., Saukh, O., Shariatnia, M. M., Sedghi, H., & Schmidt, L. (2023). The role of pre-training data in transfer learning. https://doi.org/10.48550/arXiv.2302.13602
                  76. Ester, M., Kriegel, H.-P., Sander, J., & Xu, X. (1996). A density-based algorithm for discovering clusters in large spatial databases with noise. Proceedings of 2nd International Conference on Knowledge Discovery and Data Mining.
                  77. 繁桝 算男. (1985). ベイズ統計入門. 東京大学出版会.
                  78. Fedus, W., Zoph, B., & Shazeer, N. (2022). Switch transformers: scaling to trillion parameter models with simple and efficient sparsity. Journal of Machine Learning Research, 23(120), 5232–5270.
                  79. Felten, E. W., Raj, M., & Seamans, R. (2023). How will Language Modelers like ChatGPT Affect Occupations and Industries? (Number 4375268) [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.4375268
                  80. Felten, E. W., Raj, M., & Seamans, R. (2018). A method to link advances in artificial intelligence to occupational abilities. AEA Papers and Proceedings, 108, 54–57. https://doi.org/10.1257/pandp.20181021
                  81. Felten, E. W., Raj, M., & Seamans, R. (2023). Occupational heterogeneity in exposure to Generative AI (Number 4414065) [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.4414065
                  82. Fernando, T., Priyasad, D., Sridharan, S., Ross, A., & Fookes, C. (2025). Face deepfakes: a comprehensive review. https://doi.org/10.48550/arXiv.2502.09812
                  83. Filippucci, F., Gal, P., Jona-Lasinio, C., Leandro, A., & Nicoletti, G. (2024). The impact of artificial intelligence on productivity, distribution and growth: key mechanisms, initial evidence and policy challenges (OECD Artificial Intelligence Papers No. 15). OECD Publishing. https://doi.org/10.1787/8d900037-en
                  84. Filippucci, F., Gal, P., & Schief, M. (2024). Miracle or myth? Assessing the macroeconomic productivity gains from artificial intelligence (OECD Artificial Intelligence Papers No. 29). OECD Publishing. https://doi.org/10.1787/b524a072-en
                  85. Foret, P., Kleiner, A., Mobahi, H., & Neyshabur, B. (2021). Sharpness-Aware Minimization for Efficiently Improving Generalization. https://doi.org/10.48550/arXiv.2010.01412
                  86. Forina, M., Armanino, C., Castino, M., & Ubigli, M. (1986). Multivariate data analysis as a discriminating method of the origin of wines. VITIS - Journal of Grapevine Research, 25(3), 189–201. https://doi.org/10.5073/vitis.1986.25.189-201
                  87. Fraley, C., & Raftery, A. E. (1998). How Many Clusters? Which Clustering Method? Answers Via Model-Based Cluster Analysis. The Computer Journal, 41(8), 578–588. https://doi.org/10.1093/comjnl/41.8.578
                  88. Friedman, J., Hastie, T., & Tibshirani, R. (2001). The Elements of Statistical Learning (Vol. 1). Springer Series in Statistics, New York. http://statweb.stanford.edu/~tibs/book/preface.ps
                  89. 冨岡 亮太. (2015). スパース性に基づく機械学習 = Machine learning with sparsity inducing regularizations (MLP機械学習プロフェッショナルシリーズ) (講談社サイエンティフィク, Ed.). 講談社. https://elib.maruzen.co.jp/elib/html/BookDetail/Id/3000080942/
                  90. Fu, L., Yang, B., Kuang, Z., Song, J., Li, Y., Zhu, L., Luo, Q., Wang, X., Lu, H., Huang, M., Li, Z., Tang, G., Shan, B., Lin, C., Liu, Q., Wu, B., Feng, H., Liu, H., Huang, C., … Bai, X. (2024). OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning (Version 1). https://doi.org/10.48550/arXiv.2501.00321
                  91. 福水 健次. (2010). カーネル法入門―正定値カーネルによるデータ解析. 朝倉書店.
                  92. Geirhos, R., Rubisch, P., Michaelis, C., Bethge, M., Wichmann, F. A., & Brendel, W. (2018, September 27). ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness. https://openreview.net/forum?id=Bygh9j09KX
                  93. Geirhos, R., Jacobsen, J.-H., Michaelis, C., Zemel, R., Brendel, W., Bethge, M., & Wichmann, F. A. (2020). Shortcut learning in deep neural networks. Nature Machine Intelligence, 2(11), 665–673. https://doi.org/10.1038/s42256-020-00257-z
                  94. Gemini Team, Google. (2025). Gemini 2.5: pushing the frontier with advanced reasoning, multimodality, long context, and next generation agentic capabilities. https://storage.googleapis.com/deepmind-media/gemini/gemini_v2_5_report.pdf
                  95. Ge, D., Jiang, X., & Ye, Y. (2011). A note on the complexity of Lp minimization. Mathematical Programming, 129(2), 285–299. https://doi.org/10.1007/s10107-011-0470-2
                  96. Gershman, S. J., & Goodman, N. D. (2014). Amortized Inference in Probabilistic Reasoning. Proceedings of the Annual Meeting of the Cognitive Science Society, 36.
                  97. Girvan, M., & Newman, M. E. J. (2002). Community structure in social and biological networks. Proceedings of the National Academy of Sciences, 99(12), 7821–7826. https://doi.org/10.1073/pnas.122653799
                  98. Gneiting, T. (2011). Making and evaluating point forecasts. Journal of the American Statistical Association, 106(494), 746–762. https://doi.org/10.1198/jasa.2011.r10138
                  99. Goldblum, M., Finzi, M., Rowan, K., & Wilson, A. G. (2024). Position: the no free lunch theorem, Kolmogorov complexity, and the role of inductive biases in machine learning. Proceedings of the 41st International Conference on Machine Learning (Position Paper Track), 235, 15788–15808.
                  100. Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press. http://www.deeplearningbook.org
                  101. Gottesman, Y. (2023). Understand diffusion models with VAEs. Yoni Gottesman. https://yonigottesman.github.io/2023/03/11/vae.html
                  102. GO株式会社. (2023). タクシーアプリ『GO』のデータ基盤の全体像. https://www.slideshare.net/slideshow/ss-258369181/258369181
                  103. Guan, M. Y., Joglekar, M., Wallace, E., Jain, S., Barak, B., Helyar, A., Dias, R., Vallone, A., Ren, H., Wei, J., Chung, H. W., Toyer, S., Heidecke, J., Beutel, A., & Glaese, A. (2025). Deliberative alignment: reasoning enables safer language models. https://doi.org/10.48550/arXiv.2412.16339
                  104. Guo, C., Pleiss, G., Sun, Y., & Weinberger, K. Q. (2017). On calibration of modern neural networks. Proceedings of the 34th International Conference on Machine Learning - Volume 70, 1321–1330.
                  105. Hall, P., Marron, J. S., & Neeman, A. (2005). Geometric representation of high dimension, low sample size data. Journal of the Royal Statistical Society Series B: Statistical Methodology, 67(3), 427–444. https://doi.org/10.1111/j.1467-9868.2005.00510.x
                  106. Handa, K., Tamkin, A., McCain, M., Huang, S., Durmus, E., Heck, S., Mueller, J., Hong, J., Ritchie, S., Belonax, T., Troy, K. K., Amodei, D., Kaplan, J., Clark, J., & Ganguli, D. (2025). Which economic tasks are performed with AI? Evidence from millions of Claude conversations. https://doi.org/10.48550/arXiv.2503.04761
                  107. Hand, M. (2014). From Cyberspace to the Dataverse: Trajectories in Digital Social Research. In Big Data? Qualitative Approaches to Digital Research (Vol. 13, pp. 1–27). Emerald Group Publishing Limited. https://doi.org/10.1108/S1042-319220140000013002
                  108. Hariri, S., Kind, M. C., & Brunner, R. J. (2021). Extended isolation forest. IEEE Transactions on Knowledge and Data Engineering, 33(4), 1479–1489. https://doi.org/10.1109/TKDE.2019.2947676
                  109. Hayou, S., Ghosh, N., & Yu, B. (2024). LoRA+: Efficient Low Rank Adaptation of Large Models. Proceedings of the 41st International Conference on Machine Learning, 17783–17806. https://proceedings.mlr.press/v235/hayou24a.html
                  110. 黒田 成俊. (1980). 関数解析 (Number 15). 共立出版.
                  111. 横井 祥. (2024). 「確率的なオウム」にできること、またそれがなぜできるのかについて. https://speakerdeck.com/eumesy/language-models-as-modern-version-of-the-use-theory-of-meaning
                  112. Henighan, T., Kaplan, J., Katz, M., Chen, M., Hesse, C., Jackson, J., Jun, H., Brown, T. B., Dhariwal, P., Gray, S., Hallacy, C., Mann, B., Radford, A., Ramesh, A., Ryder, N., Ziegler, D. M., Schulman, J., Amodei, D., & McCandlish, S. (2020). Scaling laws for autoregressive generative modeling. https://doi.org/10.48550/arXiv.2010.14701
                  113. Hermann, K., Mobahi, H., Fel, T., & Mozer, M. C. (2023, October 13). On the foundations of shortcut learning. https://openreview.net/forum?id=Tj3xLVuE9f
                  114. Hernandez, D., Kaplan, J., Henighan, T., & McCandlish, S. (2021). Scaling laws for transfer. https://doi.org/10.48550/arXiv.2102.01293
                  115. Hinton, G. E., Srivastava, N., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. R. (2012). Improving neural networks by preventing co-adaptation of feature detectors. https://doi.org/10.48550/arXiv.1207.0580
                  116. Ho, J., Jain, A., & Abbeel, P. (2020). Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems, 33, 6840–6851.
                  117. An empirical analysis of compute-optimal large language model training. (2022, October 31). https://openreview.net/forum?id=iBBcRUlOAPR
                  118. Training compute-optimal large language models. (2022). http://arxiv.org/abs/2203.15556
                  119. Holtzman, A., Buys, J., Du, L., Forbes, M., & Choi, Y. (2019, September 25). The curious case of neural text degeneration. https://openreview.net/forum?id=rygGQyrFvH
                  120. Hu, E. J., Shen, Y., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, S., Wang, L., & Chen, W. (2022). LoRA: Low-rank adaptation of large language models. International Conference on Learning Representations. https://openreview.net/forum?id=nZeVKeeFYf9
                  121. Huang, Z. (1998). Extensions to the k-means algorithm for clustering large data sets with categorical values. Data Mining and Knowledge Discovery, 2(3), 283–304. https://doi.org/10.1023/A:1009769707641
                  122. Hu, K. (2023). ChatGPT sets record for fastest-growing user base - analyst note. Reuters. https://www.reuters.com/technology/chatgpt-sets-record-fastest-growing-user-base-analyst-note-2023-02-01/
                  123. 会計担当が38億円を詐欺グループに送金、ビデオ会議のCFOは偽物. (2024). CNN.co.jp. https://www.cnn.co.jp/world/35214839.html
                  124. Hutchins, J. (1995). "The whisky was invisible", or Persistent myths of MT. MT News International, 11. https://web.archive.org/web/20210103041306/http://www.hutchinsweb.me.uk/MTNI-11-1995.pdf
                  125. Izmailov, P., Podoprikhin, D., Garipov, T., Vetrov, D., & Wilson, A. G. (2018). Averaging weights leads to wider optima and better generalization. Proceedings of the Conference on Uncertainty in Artificial Intelligence.
                  126. Jaiswal, A., Babu, A. R., Zadeh, M. Z., Banerjee, D., & Makedon, F. (2021). A Survey on Contrastive Self-Supervised Learning. Technologies, 9(1), 2. https://doi.org/10.3390/technologies9010002
                  127. James, G., Witten, D., Hastie, T., Tibshirani, R., & Taylor, J. (2023). An Introduction to Statistical Learning, with Applications in Python. Springer.
                  128. Jin, M., Yu, Q., Shu, D., Zhao, H., Hua, W., Meng, Y., Zhang, Y., & Du, M. (2024). The impact of reasoning step length on large language models. In L.-W. Ku, A. Martins, & V. Srikumar (Eds.), Findings of the Association for Computational Linguistics: ACL 2024 (pp. 1830–1842). Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.findings-acl.108
                  129. 金森 敬文, 鈴木 大慈, 竹内 一郎, & 佐藤 一誠. (2016). 機械学習のための連続最適化. 講談社.
                  130. 金森 敬文. (2015). 統計的学習理論. 講談社.
                  131. Jocher, G., Qiu, J., & Chaurasia, A. (2023). Ultralytics YOLO (Version 8.0.0). https://github.com/ultralytics/ultralytics
                  132. John Schulman. (2023). Reinforcement Learning from Human Feedback: Progress and Challenges. https://www.youtube.com/watch?v=hhiLw5Q_UFg
                  133. Jolliffe, I. T. (2010). Principal Component Analysis (2nd ed.). Springer.
                  134. Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., Tunyasuvunakool, K., Bates, R., Žídek, A., Potapenko, A., Bridgland, A., Meyer, C., Kohl, S. A. A., Ballard, A. J., Cowie, A., Romera-Paredes, B., Nikolov, S., Jain, R., Adler, J., … Hassabis, D. (2021). Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873), 583–589. https://doi.org/10.1038/s41586-021-03819-2
                  135. Kallini, J., Papadimitriou, I., Futrell, R., Mahowald, K., & Potts, C. (2024). Mission: Impossible language models. In L.-W. Ku, A. Martins, & V. Srikumar (Eds.), Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 14691–14714). Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.acl-long.787
                  136. Kamvar, S. D., Klein, D., & Manning, C. D. (2002). Interpreting and extending classical agglomerative clustering algorithms using a model-based approach. Proceedings of the 19th International Conference on Machine Learning, 283–290.
                  137. Kaplan, J., McCandlish, S., Henighan, T., Brown, T. B., Chess, B., Child, R., Gray, S., Radford, A., Wu, J., & Amodei, D. (2020). Scaling laws for neural language models. https://doi.org/10.48550/arXiv.2001.08361
                  138. Khan, F. (2025). FareedKhan-dev/train-deepseek-r1. https://github.com/FareedKhan-dev/train-deepseek-r1
                  139. Kim, J. (2025). kjsman/stable-diffusion-pytorch. https://github.com/kjsman/stable-diffusion-pytorch
                  140. Kingma, D. P., & Ba, J. (2015). Adam: A method for stochastic optimization. Proceedings of 3rd International Conference for Learning Representations. http://arxiv.org/abs/1412.6980
                  141. Kingma, D. P., & Welling, M. (2019). An Introduction to Variational Autoencoders. Foundations and Trends® in Machine Learning, 12(4), 307–392. https://doi.org/10.1561/2200000056
                  142. Kirk, H. R., Jun, Y., Iqbal, H., Benussi, E., Volpin, F., Dreyer, F. A., Shtedritski, A., & Asano, Y. M. (2024). Bias out-of-the-box: an empirical analysis of intersectional occupational biases in popular generative language models. Proceedings of the 35th International Conference on Neural Information Processing Systems, 2611–2624.
                  143. Kitchin, R. (2014). Big Data, new epistemologies and paradigm shifts. Big Data & Society, 1(1). https://doi.org/10.1177/2053951714528481
                  144. Kleinberg, J., Mullainathan, S., & Raghavan, M. (2017). Inherent trade-offs in the fair determination of risk scores. In C. H. Papadimitriou (Ed.), 8th Innovations in Theoretical Computer Science Conference (Vol. 67, pp. 43:1–43:23). Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Germany. https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2017.43
                  145. Kobak, D., González-Márquez, R., Horvát, E.-Á., & Lause, J. (2025). Delving into LLM-assisted writing in biomedical publications through excess vocabulary. Science Advances, 11(27), eadt3813. https://doi.org/10.1126/sciadv.adt3813
                  146. Koenker, R. (2005). Quantile Regression (Number 38). Cambridge University Press.
                  147. Koenker, R., & Bassett, G., Jr. (1978). Regression quantiles. Econometrica, 46(1), 33–50. https://doi.org/10.2307/1913643
                  148. Kojima, T., Gu, S. S., Reid, M., Matsuo, Y., & Iwasawa, Y. (2022). Large language models are zero-shot reasoners. Proceedings of the 36th International Conference on Neural Information Processing Systems, 22199–22213. https://doi.org/10.5555/3600270.3601883
                  149. Kotz, S., Balakrishnan, N., Read, C. B., & Vidakovic, B. (Eds.). (2006). Encyclopedia of Statistical Sciences (2nd ed). Wiley-Interscience.
                  150. Lambert, N., Morrison, J., Pyatkin, V., Huang, S., Ivison, H., Brahman, F., Miranda, L. J. V., Liu, A., Dziri, N., Lyu, S., Gu, Y., Malik, S., Graf, V., Hwang, J. D., Yang, J., Bras, R. L., Tafjord, O., Wilhelm, C., Soldaini, L., … Hajishirzi, H. (2025). Tulu 3: Pushing Frontiers in Open Language Model Post-Training. https://doi.org/10.48550/arXiv.2411.15124
                  151. Lane, M. (2024). Who will be the workers most affected by AI? A closer look at the impact of AI on women, low-skilled workers and other groups (Technical Report No. 26). OECD Publishing. https://doi.org/10.1787/14dc6f89-en
                  152. Learn Prompting: Your Guide to Communicating with AI. (2023). https://learnprompting.org/
                  153. Lee, K.-H., Fischer, I., Wu, Y.-H., Marwood, D., Baluja, S., Schuurmans, D., & Chen, X. (2025). Evolving deeper LLM thinking. https://doi.org/10.48550/arXiv.2501.09891
                  154. Lee, J. D., Sun, D. L., Sun, Y., & Taylor, J. E. (2016). Exact Post-Selection Inference, with Application to the Lasso. The Annals of Statistics, 44(3), 907–927. https://www.jstor.org/stable/43818915
                  155. Li, P., Yang, J., Islam, M. A., & Ren, S. (2025). Making AI less ’thirsty.’ Commun. ACM, 68(7), 54–61. https://doi.org/10.1145/3724499
                  156. 鈴木 大慈. (2015). 確率的最適化. 講談社.
                  157. Li, C. (2020). OpenAI’s GPT-3 language model: a technical overview. https://lambdalabs.com/blog/demystifying-gpt-3
                  158. Lipsey, R. G., Carlaw, K. I., & Bekar, C. T. (2005). Economic Transformations: General Purpose Technologies and Long-Term Economic Growth. Oxford University Press. https://doi.org/10.1093/oso/9780199285648.001.0001
                  159. Liu, R., Gao, J., Zhao, J., Zhang, K., Li, X., Qi, B., Ouyang, W., & Zhou, B. (2025). Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling. https://doi.org/10.48550/arXiv.2502.06703
                  160. Liu, F. T., Ting, K. M., & Zhou, Z.-H. (2012). Isolation-Based Anomaly Detection. ACM Trans. Knowl. Discov. Data, 6(1), 3:1–3:39. https://doi.org/10.1145/2133360.2133363
                  161. Li, T., Khashabi, D., Khot, T., Sabharwal, A., & Srikumar, V. (2020). UNQOVERing stereotyping biases via underspecified questions. In T. Cohn, Y. He, & Y. Liu (Eds.), Findings of the Association for Computational Linguistics: EMNLP 2020 (pp. 3475–3489). Association for Computational Linguistics. https://doi.org/10.18653/v1/2020.findings-emnlp.311
                  162. Li, H., Xu, Z., Taylor, G., Studer, C., & Goldstein, T. (2018). Visualizing the loss landscape of neural nets. Advances in Neural Information Processing Systems, 31. https://proceedings.neurips.cc/paper_files/paper/2018/hash/a41b3bb3e6b050b6c9067c67f663b915-Abstract.html
                  163. Luenberger, D. G. (1997). Optimization by Vector Space Methods (1st ed.). John Wiley & Sons, Inc.
                  164. Lu, Y., & Morgan, J. L. (2020). Homophone auditory processing in cross-linguistic perspective. Proceedings of the Linguistic Society of America, 5(1), 529–542. https://doi.org/10.3765/plsa.v5i1.4733
                  165. Mäkelä, E., & Stephany, F. (2025). Complement or substitute? How AI increases the demand for human skills (Number 5153230) [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.5153230
                  166. Malkov, Y. A., & Yashunin, D. A. (2018). Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs. https://doi.org/10.48550/arXiv.1603.09320
                  167. Ma, Y., & Fu, Y. (2012). Manifold Learning Theory and Applications. CRC Press. http://www.amazon.com/Manifold-Learning-Theory-Applications-Yunqian/dp/1439871094/
                  168. Manning, C. D., Raghavan, P., & Schütze, H. (2009). An Introduction to Information Retrieval. Cambridge University Press.
                  169. Manzini, A., Keeling, G., Alberts, L., Vallor, S., Morris, M. R., & Gabriel, I. (2024). The code that binds us: navigating the appropriateness of human-AI assistant relationships. Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 7(1), 943–957. https://doi.org/10.1609/aies.v7i1.31694
                  170. Marx, K. (1959). Economic & Philosophic Manuscripts of 1844 (M. Milligan, Trans.). Progress Publishers. https://www.marxists.org/archive/marx/works/1844/manuscripts/preface.htm
                  171. Mata v. Avianca, Inc. (Number 1:22-cv-01461). District Court, S.D. New York. Retrieved July 13, 2025, from https://www.courtlistener.com/docket/63107798/mata-v-avianca-inc/
                  172. 梅谷 俊治. (2020). しっかり学ぶ数理最適化 モデルからアルゴリズムまで. 講談社.
                  173. Mikolov, T., Sutskever, I., Chen, K., Corrado, G., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. Proceedings of the 27th International Conference on Neural Information Processing Systems - Volume 2, 2, 3111–3119.
                  174. Min, S., Lewis, M., Hajishirzi, H., & Zettlemoyer, L. (2022). Noisy channel language model prompting for few-shot text classification. In S. Muresan, P. Nakov, & A. Villavicencio (Eds.), Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 5316–5330). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.acl-long.365
                  175. Min, S., Lyu, X., Holtzman, A., Artetxe, M., Lewis, M., Hajishirzi, H., & Zettlemoyer, L. (2022). Rethinking the role of demonstrations: what makes in-context learning work? In Y. Goldberg, Z. Kozareva, & Y. Zhang (Eds.), Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (pp. 11048–11064). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.emnlp-main.759
                  176. Miyamoto, S. (2012). Inductive and non-inductive methods of clustering. 2012 IEEE International Conference on Granular Computing, 12–17. https://doi.org/10.1109/GrC.2012.6468710
                  177. Moffatt v. Air Canada. (2024). In CanLII (Vol. 149, Number SC-2023-005609). BCCRT. https://canlii.ca/t/k2spq
                  178. Mohri, M., Rostamizadeh, A., & Talwalkar, A. (2018). Foundations of Machine Learning (2nd ed.). The MIT Press.
                  179. Molnar, C. Interpretable Machine Learning. Retrieved February 23, 2025, from https://christophm.github.io/interpretable-ml-book/
                  180. Molnar, C. Interpretable Machine Learning (邦訳). Retrieved February 23, 2025, from https://hacarus.github.io/interpretable-ml-book-ja/index.html
                  181. Muennighoff, N., Yang, Z., Shi, W., Li, X. L., Fei-Fei, L., Hajishirzi, H., Zettlemoyer, L., Liang, P., Candès, E., & Hashimoto, T. (2025). s1: Simple test-time scaling. https://doi.org/10.48550/arXiv.2501.19393
                  182. Muennighoff, N., Rush, A. M., Barak, B., Scao, T. L., Tazi, N., Piktus, A., Pyysalo, S., Wolf, T., & Raffel, C. (2023, November 2). Scaling data-constrained language models. https://openreview.net/forum?id=j5BuTrEj35
                  183. Mukhoti, J., Kulharia, V., Sanyal, A., Golodetz, S., Torr, P. H. S., & Dokania, P. K. (2020). Calibrating deep neural networks using focal loss. Advances in Neural Information Processing Systems, 33, 15288–15299.
                  184. Natarajan, B. K. (1995). Sparse Approximate Solutions to Linear Systems. SIAM Journal on Computing, 24(2), 227–234. https://doi.org/10.1137/S0097539792240406
                  185. ndl-lab/pdmocrdataset-part1. (2024). ndl-lab. https://github.com/ndl-lab/pdmocrdataset-part1
                  186. Newman, M. E. J. (2004). Fast algorithm for detecting community structure in networks. Physical Review E, 69(6), 066133. https://doi.org/10.1103/PhysRevE.69.066133
                  187. Newman, M. E. J., & Girvan, M. (2004). Finding and evaluating community structure in networks. Physical Review E, 69(2), 026113. https://doi.org/10.1103/PhysRevE.69.026113
                  188. Newman, M. E. J. (2006). Modularity and community structure in networks. Proceedings of the National Academy of Sciences, 103(23), 8577–8582. https://doi.org/10.1073/pnas.0601602103
                  190. No Free Lunch Theorems. Retrieved February 15, 2025, from http://www.no-free-lunch.org/
                  191. Noy, S., & Zhang, W. (2023). Experimental evidence on the productivity effects of generative artificial intelligence. Science, 381(6654), 187–192. https://doi.org/10.1126/science.adh2586
                  192. NVIDIA announces financial results for first quarter fiscal 2026. (2025). NVIDIA Newsroom. https://nvidianews.nvidia.com/news/nvidia-announces-financial-results-for-first-quarter-fiscal-2026
                  193. O’Brien, P. C., & Fleming, T. R. (1979). A multiple testing procedure for clinical trials. Biometrics, 35(3), 549–556. https://doi.org/10.2307/2530245
                  194. Odena, A., Dumoulin, V., & Olah, C. (2016). Deconvolution and checkerboard artifacts. Distill, 1(10), e3. https://doi.org/10.23915/distill.00003
                  195. Oikarinen, T., & Weng, T.-W. (2022, September 29). CLIP-Dissect: Automatic Description of Neuron Representations in Deep Vision Networks. https://openreview.net/forum?id=iPWiwWHc1V
                  196. OpenAI Platform. Retrieved August 29, 2024, from https://platform.openai.com
                  197. OpenAI. (2023). GPT-4 technical report. https://doi.org/10.48550/arXiv.2303.08774
                  198. OpenAI. (2024). Learning to reason with LLMs. https://openai.com/index/learning-to-reason-with-llms/
                  199. Osband, K. H. (1985). Providing Incentives for Better Cost Forecasting [PhD thesis]. University of California, Berkeley.
                  200. Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C., Mishkin, P., Zhang, C., Agarwal, S., Slama, K., Ray, A., & others. (2022). Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems, 35, 27730–27744.
                  201. OWASP. (2024). OWASP Top 10 for LLM applications & generative AI (Technical Report OWASP PDF v4.2.0a 20241114-202703).
                  202. Parry, R. M., Jones, W., Stokes, T. H., Phan, J. H., Moffitt, R. A., Fang, H., Shi, L., Oberthuer, A., Fischer, M., Tong, W., & Wang, M. D. (2010). k-Nearest neighbor models for microarray gene expression analysis and clinical outcome prediction. The Pharmacogenomics Journal, 10(4), 292–309. https://doi.org/10.1038/tpj.2010.56
                  203. Pelletier, B. (2024). On the statistical properties of the isolation forest anomaly detection method. https://hal.science/hal-04430185
                  204. Penedo, G., Malartic, Q., Hesslow, D., Cojocaru, R., Cappelli, A., Alobeidli, H., Pannier, B., Almazrouei, E., & Launay, J. (2023). The RefinedWeb dataset for Falcon LLM: outperforming curated corpora with web data, and web data only. https://doi.org/10.48550/arXiv.2306.01116
                  205. Peng, S., Kalliamvakou, E., Cihon, P., & Demirer, M. (2023). The impact of AI on developer productivity: evidence from GitHub Copilot. https://doi.org/10.48550/arXiv.2302.06590
                  206. Petersen, K. B., & Pedersen, M. S. (2012, November). The Matrix Cookbook. Technical University of Denmark. http://www2.compute.dtu.dk/pubdb/pubs/3274-full.html
                  207. 坪井 祐太, 海野 裕也, & 鈴木 潤. (2017). 深層学習による自然言語処理. 講談社.
                  208. Plesner, A., Vontobel, T., & Wattenhofer, R. (2024). Breaking reCAPTCHAv2. https://doi.org/10.1109/COMPSAC61105.2024.00142
                  209. Murphy, K. P. (2022). Probabilistic Machine Learning: An Introduction. MIT Press. https://probml.github.io/pml-book/book1.html
                  210. Prompt Engineering Guide – Nextra. Retrieved August 29, 2024, from https://www.promptingguide.ai/
                  211. Proschan, M. A., Lan, K. K. G., & Wittes, J. T. (2006). Statistical Monitoring of Clinical Trials: A Unified Approach. Springer.
                  212. Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., Krueger, G., & Sutskever, I. (2021). Learning transferable visual models from natural language supervision. Proceedings of the 38th International Conference on Machine Learning, 8748–8763. https://proceedings.mlr.press/v139/radford21a.html
                  213. Rafailov, R., Hejna, J., Park, R., & Finn, C. (2024, August 26). From r to Q*: Your Language Model is Secretly a Q-Function. https://openreview.net/forum?id=kEVcNxtqXk
                  214. Rafailov, R., Sharma, A., Mitchell, E., Manning, C. D., Ermon, S., & Finn, C. (2023, November 2). Direct preference optimization: your language model is secretly a reward model. https://openreview.net/forum?id=HPuSIXJaa9
                  215. Raghu, M., Zhang, C., Kleinberg, J., & Bengio, S. (2019). Transfusion: understanding transfer learning for medical imaging. In Proceedings of the 33rd International Conference on Neural Information Processing Systems (Number 301, pp. 3347–3357). Curran Associates Inc.
                  216. Raschka, S., & Mirjalili, V. (2020). Python機械学習プログラミング:達人データサイエンティストによる理論と実践 (福島 真太朗 & 株式会社クイープ, Trans.; 第3版). インプレス.
                  217. Rasmussen, C. E., & Williams, C. K. I. (2005). Gaussian Processes for Machine Learning. The MIT Press.
                  218. Razeghi, Y., Logan IV, R. L., Gardner, M., & Singh, S. (2022). Impact of pretraining term frequencies on few-shot numerical reasoning. In Y. Goldberg, Z. Kozareva, & Y. Zhang (Eds.), Findings of the Association for Computational Linguistics: EMNLP 2022 (pp. 840–854). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.findings-emnlp.59
                  219. A generalist agent. (2022). Transactions on Machine Learning Research. https://openreview.net/forum?id=1ikK0kHjvj
                  220. 人工知能関連技術の研究開発及び活用の推進に関する法律, (2025). https://laws.e-gov.go.jp/law/507AC0000000053
                  221. 人工知能学会. (2023年5月版). AIマップβ2.0. https://www.ai-gakkai.or.jp/aimap/
                  222. 日本放送協会. (2025). 「性的ディープフェイク」で行政罰の条例改正案を可決 鳥取県. NHKニュース. https://www3.nhk.or.jp/news/html/20250630/k10014848661000.html
                  223. Robertson, S., & Zaragoza, H. (2009). The probabilistic relevance framework: BM25 and beyond. Found. Trends Inf. Retr., 3(4), 333–389. https://doi.org/10.1561/1500000019
                  224. Rokach, L., & Maimon, O. (2014). Data Mining with Decision Trees: Theory and Applications (2nd ed., Vol. 81). WORLD SCIENTIFIC. https://doi.org/10.1142/9097
                  225. Rombach, R., Blattmann, A., Lorenz, D., Esser, P., & Ommer, B. (2022). High-resolution image synthesis with latent diffusion models. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 10674–10685. https://doi.org/10.1109/CVPR52688.2022.01042
                  226. Ronneberger, O., Fischer, P., & Brox, T. (2015). U-Net: convolutional networks for biomedical image segmentation. In N. Navab, J. Hornegger, W. M. Wells, & A. F. Frangi (Eds.), Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015 (pp. 234–241). Springer International Publishing. https://doi.org/10.1007/978-3-319-24574-4_28
                  227. Firth, J. R. (1957). A synopsis of linguistic theory, 1930-1955. Studies in Linguistic Analysis. https://cir.nii.ac.jp/crid/1570854175539816192?lang=en
                  228. Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A. C., & Fei-Fei, L. (2015). ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 115(3), 211–252. https://doi.org/10.1007/s11263-015-0816-y
                  229. Sammut, C., & Webb, G. I. (Eds.). (2017). Encyclopedia of Machine Learning and Data Mining. Springer US. https://doi.org/10.1007/978-1-4899-7687-1
                  230. Santurkar, S., Durmus, E., Ladhak, F., Lee, C., Liang, P., & Hashimoto, T. (2023). Whose opinions do language models reflect? Proceedings of the 40th International Conference on Machine Learning, 202, 29971–30004.
                  231. Sardana, N., Portes, J., Doubov, S., & Frankle, J. (2024). Beyond Chinchilla-optimal: accounting for inference in language model scaling laws. Proceedings of the 41st International Conference on Machine Learning, 235, 43445–43460.
                  232. Schaeffer, R., Miranda, B., & Koyejo, S. (2023, November 2). Are emergent abilities of large language models a mirage? https://openreview.net/forum?id=ITw9edRDlD
                  233. Schubert, E., Sander, J., Ester, M., Kriegel, H. P., & Xu, X. (2017). DBSCAN revisited, revisited: why and how you should (still) use DBSCAN. ACM Trans. Database Syst., 42(3), 19:1–19:21. https://doi.org/10.1145/3068335
                  234. Schulhoff, S., Ilie, M., Balepur, N., Kahadze, K., Liu, A., Si, C., Li, Y., Gupta, A., Han, H. J., Schulhoff, S., Dulepet, P. S., Vidyadhara, S., Ki, D., Agrawal, S., Pham, C., Kroiz, G., Li, F., Tao, H., Srivastava, A., … Resnik, P. (2024). The Prompt Report: A Systematic Survey of Prompting Techniques. http://arxiv.org/abs/2406.06608
                  235. Schulman, J., Wolski, F., Dhariwal, P., Radford, A., & Klimov, O. (2017). Proximal Policy Optimization Algorithms. https://doi.org/10.48550/arXiv.1707.06347
                  236. Seber, G. A. F., & Lee, A. J. (2003). Linear Regression Analysis (2nd ed.). Wiley.
                  237. Shalev-Shwartz, S., & Ben-David, S. (2014). Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press. https://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning/copy.html
                  238. 山田 育矢, 鈴木 正敏, 山田 康輔, & 李 凌寒. (2023). 大規模言語モデル入門. 技術評論社.
                  239. 山田 育矢, 鈴木 正敏, 西川 荘介, 藤井 一喜, 山田 康輔, & 李 凌寒. (2024). 大規模言語モデル入門Ⅱ〜生成型LLMの実装と評価. 技術評論社.
                  240. 山下 信雄. (2015). 非線形計画法. 朝倉書店. http://opac.dl.itc.u-tokyo.ac.jp/opac/opac_details/?lang=0&amode=11&bibid=2003278170
                  241. Sharif, M., Bhagavatula, S., Bauer, L., & Reiter, M. K. (2016). Accessorize to a crime: real and stealthy attacks on state-of-the-art face recognition. Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, 1528–1540. https://doi.org/10.1145/2976749.2978392
                  242. 神嶌 敏弘. (2019). 変わりゆく機械学習と変わらない機械学習. 日本物理学会誌, 74(1), 5–13. https://doi.org/10.11316/butsuri.74.1_5
                  243. 生成AIでランサムウェアを作成した容疑者の摘発事例を考察. (2025). Trend Micro. https://www.trendmicro.com/ja_jp/jp-security/24/e/breaking-securitynews-20240529-02.html
                  244. 生成AI悪用し楽天モバイルに不正アクセス、1000件以上の回線入手し転売か…容疑で中高生3人逮捕. (2025). 読売新聞オンライン. https://www.yomiuri.co.jp/national/20250226-OYT1T50205/
                  245. 生成AI悪用しウイルス作成、有罪判決…IT知識なくとも「1か月ぐらいで簡単に作れた」. (2024). 読売新聞オンライン. https://www.yomiuri.co.jp/national/20241025-OYT1T50209/
                  246. 石黒 勝彦, & 林 浩平. (2016). 関係データ学習. 講談社.
                  247. 石井 健一郎, & 上田 修功. (2014). 教師なし学習入門 (わかりやすいパターン認識 続). オーム社.
                  248. Shinn, N., Cassano, F., Gopinath, A., Narasimhan, K. R., & Yao, S. (2023, November 2). Reflexion: language agents with verbal reinforcement learning. https://openreview.net/forum?id=vAElhFcKW6
                  249. SIGKDD News: 2014 SIGKDD Test of Time Award. (2014). https://www.kdd.org/News/view/2014-sigkdd-test-of-time-award
                  250. Singhal, S., Zeng, J., Bukharin, A., Zhang, Y., Shen, G., Mahabaleshwarkar, A. S., Kartal, B., Suhara, Y., Bercovich, A., Levy, I., Golan, I., Dabbah, M., El-Yaniv, R., Majumdar, S., Gitman, I., Bakhturina, E., Zhang, J. J., Su, B.-Y., Huang, G., … Konuk, T. (2025, June 21). Llama-Nemotron: Efficient Reasoning Models. https://openreview.net/forum?id=ev1xpo9mbI
                  251. Snell, C. V., Lee, J., Xu, K., & Kumar, A. (2024, October 4). Scaling LLM test-time compute optimally can be more effective than scaling parameters for reasoning. https://openreview.net/forum?id=4FWAwZtd2n
                  252. Song, Y., & Ermon, S. (2019). Generative modeling by estimating gradients of the data distribution. In Proceedings of the 33rd International Conference on Neural Information Processing Systems (Number 1067, pp. 11918–11930). Curran Associates Inc.
                  253. Song, Y. (2021). Generative modeling by estimating gradients of the data distribution. Yang Song. https://yang-song.net/blog/2021/score
                  254. Song, Y., Durkan, C., Murray, I., & Ermon, S. (2021, November 9). Maximum likelihood training of score-based diffusion models. https://openreview.net/forum?id=AklttWFnxS9
                  255. Speech and Language Processing. Retrieved June 18, 2025, from https://web.stanford.edu/~jurafsky/slp3/
                  256. Steinwart, I., Pasin, C., Williamson, R., & Zhang, S. (2014). Elicitation and identification of properties. Proceedings of The 27th Conference on Learning Theory, 482–526. https://proceedings.mlr.press/v35/steinwart14.html
                  257. Stiennon, N., Ouyang, L., Wu, J., Ziegler, D. M., Lowe, R., Voss, C., Radford, A., Amodei, D., & Christiano, P. (2020). Learning to summarize from human feedback. Proceedings of the 34th International Conference on Neural Information Processing Systems, 3008–3021.
                  258. Storchan, V., Kumar, R., Chowdhury, R., Goldfarb-Tarrant, S., & Cattell, S. (2024). Generative AI red teaming challenge: transparency report [Technical Report]. DEF CON. https://drive.google.com/file/d/1JqpbIP6DNomkb32umLoiEPombK2-0Rc-/view
                  259. Su, J., Ahmed, M., Lu, Y., Pan, S., Bo, W., & Liu, Y. (2024). RoFormer: enhanced transformer with rotary position embedding. Neurocomputing, 568, 127063. https://doi.org/10.1016/j.neucom.2023.127063
                  260. Szeliski, R. (2022). Computer Vision: Algorithms and Applications. Springer. https://szeliski.org/Book/
                  261. Tabassi, E. (2023). Artificial Intelligence Risk Management Framework (AI RMF 1.0). National Institute of Standards and Technology. https://doi.org/10.6028/nist.ai.100-1
                  262. Takami Sato. (2024). 最適化超入門. https://speakerdeck.com/tkm2261/zui-shi-hua-chao-ru-men
                  263. Tal, Y., Magar, I., & Schwartz, R. (2022). Fewer errors, but more stereotypes? The effect of model size on gender bias. In C. Hardmeier, C. Basta, M. R. Costa-jussà, G. Stanovsky, & H. Gonen (Eds.), Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP) (pp. 112–120). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.gebnlp-1.13
                  264. Tamkin, A., McCain, M., Handa, K., Durmus, E., Lovitt, L., Rathi, A., Huang, S., Mountfield, A., Hong, J., Ritchie, S., Stern, M., Clarke, B., Goldberg, L., Sumers, T. R., Mueller, J., McEachen, W., Mitchell, W., Carter, S., Clark, J., … Ganguli, D. (2024). Clio: privacy-preserving insights into real-world ai use. https://doi.org/10.48550/arXiv.2412.13678
                  265. Taori, R., Gulrajani, I., Zhang, T., Dubois, Y., Li, X., Guestrin, C., Liang, P., & Hashimoto, T. B. (2023). Stanford Alpaca: An Instruction-following LLaMA model [Data set]. https://github.com/tatsu-lab/stanford_alpaca
                  266. 特許法. Retrieved July 18, 2025, from https://laws.e-gov.go.jp/law/334AC0000000121#Mp-Ch_1
                  267. 特許庁. (2019). 特許・実用新案審査ハンドブック 附属書B 第1章 コンピュータソフトウエア関連発明. https://www.jpo.go.jp/system/laws/rule/guideline/patent/handbook_shinsa/document/index/app_b1.pdf
                  268. Thakur, A. S., Choudhary, K., Ramayapally, V. S., Vaidyanathan, S., & Hupkes, D. (2024). Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges. https://doi.org/10.48550/arXiv.2406.12624
                  269. THU-MIG/yolov10. (2024). THU-MIG. https://github.com/THU-MIG/yolov10
                  270. Todeschini, R., Ballabio, D., & Consonni, V. (2020). Distances and similarity measures in chemometrics and chemoinformatics. In Encyclopedia of Analytical Chemistry (pp. 1–40). John Wiley & Sons, Ltd. https://doi.org/10.1002/9780470027318.a9438.pub2
                  271. Tsybakov, A. B. (2009). Introduction to Nonparametric Estimation. Springer.
                  272. Ultralytics. Ultralytics YOLO11 object detection model. Retrieved June 12, 2025, from https://github.com/ultralytics/ultralytics/blob/da98efc61d9e0467315fc86c2297c8d81e656b1a/ultralytics/cfg/models/11/yolo11.yaml
                  273. Ustalov, D., & Lambert, N. (2023, July 24). Reinforcement Learning from Human Feedback: A Tutorial [Tutorial]. https://icml.cc/virtual/2023/tutorial/21554
                  274. Neural discrete representation learning. (2017). Proceedings of the 31st International Conference on Neural Information Processing Systems, 6309–6318.
                  275. Van Hulse, J., Khoshgoftaar, T. M., & Napolitano, A. (2007). Experimental perspectives on learning from imbalanced data. Proceedings of the 24th International Conference on Machine Learning, 935–942. https://doi.org/10.1145/1273496.1273614
                  276. Vapnik, V. N. (2000). The Nature of Statistical Learning Theory. Springer New York. https://doi.org/10.1007/978-1-4757-3264-1
                  277. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. Proceedings of the 31st International Conference on Neural Information Processing Systems, 6000–6010.
                  278. Vershynin, R. (2018). High-Dimensional Probability: An Introduction with Applications in Data Science. Cambridge University Press.
                  279. Vincent, P. (2011). A connection between score matching and denoising autoencoders. Neural Computation, 23(7), 1661–1674. https://doi.org/10.1162/NECO_a_00142
                  280. Transformers learn in-context by gradient descent. (2022). https://arxiv.org/abs/2212.07677v2
                  281. Vovk, V., Gammerman, A., & Shafer, G. (2022). Algorithmic Learning in a Random World. Springer International Publishing. https://doi.org/10.1007/978-3-031-06649-8
                  282. Wainwright, M. J. (2019). High-Dimensional Statistics: A Non-Asymptotic Viewpoint (1st ed.). Cambridge University Press. https://www.cambridge.org/core/product/identifier/9781108627771/type/book
                  283. Wallace, B. C., Small, K., Brodley, C. E., & Trikalinos, T. A. (2011). Class imbalance, redux. IEEE 11th International Conference on Data Mining, 754–763. https://doi.org/10.1109/ICDM.2011.33
                  284. Wang, X., & Zhou, D. (2024, November 6). Chain-of-Thought Reasoning Without Prompting. https://openreview.net/forum?id=4Zt7S0B0Jp
                  285. Wang, A., Singh, A., Michael, J., Hill, F., Levy, O., & Bowman, S. (2018). GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding. In T. Linzen, G. Chrupała, & A. Alishahi (Eds.), Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP (pp. 353–355). Association for Computational Linguistics. https://doi.org/10.18653/v1/W18-5446
                  286. Wang, Y., Li, H., Han, X., Nakov, P., & Baldwin, T. (2023). Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs. https://doi.org/10.48550/arXiv.2308.13387
                  287. Wang, Y., Yang, Q., Zeng, Z., Ren, L., Liu, L., Peng, B., Cheng, H., He, X., Wang, K., Gao, J., Chen, W., Wang, S., Du, S. S., & Shen, Y. (2025). Reinforcement learning for reasoning in large language models with one training example. https://doi.org/10.48550/arXiv.2504.20571
                  288. Wang, A., Chen, H., Liu, L., Chen, K., Lin, Z., Han, J., & Ding, G. (2024). YOLOv10: Real-Time End-to-End Object Detection. https://doi.org/10.48550/arXiv.2405.14458
                  289. Wang, A., Chen, H., Liu, L., Chen, K., Lin, Z., Han, J., & Ding, G. (2024). YOLOv10: Real-Time End-to-End Object Detection. Advances in Neural Information Processing Systems, 37, 107984–108011. https://proceedings.neurips.cc/paper_files/paper/2024/hash/c34ddd05eb089991f06f3c5dc36836e0-Abstract-Conference.html
                  290. Ward Jr., J. H. (1963). Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association, 58(301), 236–244. https://doi.org/10.1080/01621459.1963.10500845
                  291. Warner, B., Chaffin, A., Clavié, B., Weller, O., Hallström, O., Taghadouini, S., Gallagher, A., Biswas, R., Ladhak, F., Aarsen, T., Cooper, N., Adams, G., Howard, J., & Poli, I. (2024). Smarter, better, faster, longer: a modern bidirectional encoder for fast, memory efficient, and long context finetuning and inference. https://doi.org/10.48550/arXiv.2412.13663
                  292. Webb, M. (2019). The Impact of Artificial Intelligence on the Labor Market (Number 3482150) [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.3482150
                  293. Webb, E. J., Campbell, D. T., Schwartz, R. D., & Sechrest, L. (1966). Unobtrusive Measures: Nonreactive Research in the Social Sciences. Rand McNally.
                  294. Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B., Xia, F., Chi, E. H., Le, Q. V., & Zhou, D. (2022). Chain-of-thought prompting elicits reasoning in large language models. Proceedings of the 36th International Conference on Neural Information Processing Systems, 24824–24837.
                  295. Weidinger, L., Barnhart, J., Brennan, J., Butterfield, C., Young, S., Hawkins, W., Hendricks, L. A., Comanescu, R., Chang, O., Rodriguez, M., Beroshi, J., Bloxwich, D., Proleev, L., Chen, J., Farquhar, S., Ho, L., Gabriel, I., Dafoe, A., & Isaac, W. (2024). Holistic safety and responsibility evaluations of advanced AI models. https://doi.org/10.48550/arXiv.2404.14068
                  296. Wei, J., Tay, Y., Bommasani, R., Raffel, C., Zoph, B., Borgeaud, S., Yogatama, D., Bosma, M., Zhou, D., Metzler, D., Chi, E. H., Hashimoto, T., Vinyals, O., Liang, P., Dean, J., & Fedus, W. (2022). Emergent abilities of large language models. Transactions on Machine Learning Research. https://openreview.net/forum?id=yzkSU5zdwD
                  297. Wei, J., Wei, J., Tay, Y., Tran, D., Webson, A., Lu, Y., Chen, X., Liu, H., Huang, D., Zhou, D., & Ma, T. (2023). Larger language models do in-context learning differently. https://openreview.net/forum?id=DRGnEkbiQZ
                  298. Weinberger, K. Q., & Saul, L. K. (2009). Distance metric learning for large margin nearest neighbor classification. Journal of Machine Learning Research, 10(Feb), 207–244. http://www.jmlr.org/papers/v10/weinberger09a.html
                  299. Welling, M., & Teh, Y. W. (2011). Bayesian learning via stochastic gradient Langevin dynamics. Proceedings of the 28th International Conference on Machine Learning, 681–688.
                  300. 文化審議会著作権分科会法制度小委員会. (2024). AIと著作権に関する考え方について. https://www.bunka.go.jp/seisaku/bunkashingikai/chosakuken/pdf/94037901_01.pdf
                  301. Wen, X., Liu, Z., Zheng, S., Xu, Z., Ye, S., Wu, Z., Liang, X., Wang, Y., Li, J., Miao, Z., Bian, J., & Yang, M. (2025). Reinforcement learning with verifiable rewards implicitly incentivizes correct reasoning in base LLMs. https://doi.org/10.48550/arXiv.2506.14245
                  302. Wettig, A., Gao, T., Zhong, Z., & Chen, D. (2023). Should you mask 15% in masked language modeling? In A. Vlachos & I. Augenstein (Eds.), Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (pp. 2985–3000). Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.eacl-main.217
                  303. Wolpert, D. H. (1996). The lack of a priori distinctions between learning algorithms. Neural Computation, 8(7), 1341–1390. https://doi.org/10.1162/neco.1996.8.7.1341
                  304. Wolpert, D. H., & Macready, W. G. (1997). No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation, 1(1), 67–82. https://doi.org/10.1109/4235.585893
                  305. Wolpert, D. H. (2002). The supervised learning no-free-lunch theorems. In R. Roy, M. Köppen, S. Ovaska, T. Furuhashi, & F. Hoffmann (Eds.), Soft Computing and Industry: Recent Applications (pp. 25–42). Springer. https://doi.org/10.1007/978-1-4471-0123-9_3
                  306. workpiles. (2016). CUCUMBER-9. https://github.com/workpiles/CUCUMBER-9
                  307. World Economic Forum. (2025). Future of Jobs Report 2025 [Insight Report]. https://www.weforum.org/publications/the-future-of-jobs-report-2025/digest/
                  308. Writers Guild of America. (2023). Summary of the 2023 WGA MBA. WGA Contract 2023. https://www.wgacontract2023.org/the-campaign/summary-of-the-2023-wga-mba
                  309. Wu, Y., & He, K. (2018). Group normalization. Computer Vision – ECCV 2018: 15th European Conference, Munich, Germany, September 8-14, 2018, Proceedings, Part XIII, 3–19. https://doi.org/10.1007/978-3-030-01261-8_1
                  310. Wu, T., Lan, J., Yuan, W., Jiao, J., Weston, J. E., & Sukhbaatar, S. (2025, June 18). Thinking LLMs: general instruction following with thought generation. https://openreview.net/forum?id=z6SrgYCdey&noteId=t3y0Ev0lm6
                  311. 篠田 一聡. (2024年8月25日). 論文紹介:Direct Preference Optimization: Your Language Model Is Secretly a Reward Model. https://speakerdeck.com/kazutoshishinoda/lun-wen-shao-jie-direct-preference-optimization-your-language-model-is-secretly-a-reward-model
                  312. Xiong, R., Yang, Y., He, D., Zheng, K., Zheng, S., Xing, C., Zhang, H., Lan, Y., Wang, L., & Liu, T.-Y. (2020). On layer normalization in the transformer architecture. Proceedings of the 37th International Conference on Machine Learning, 119, 10524–10533.
                  313. Xu, D., & Tian, Y. (2015). A Comprehensive Survey of Clustering Algorithms. Annals of Data Science, 2(2), 165–193. https://doi.org/10.1007/s40745-015-0040-1
                  314. Xu, H., Xie, S., Tan, X., Huang, P.-Y., Howes, R., Sharma, V., Li, S.-W., Ghosh, G., Zettlemoyer, L., & Feichtenhofer, C. (2023, October 13). Demystifying CLIP data. https://openreview.net/forum?id=5BCFlnfE1g
                  315. 穴井 宏和, & 斉藤 努. (2015). 今日から使える!組合せ最適化 離散問題ガイドブック. 講談社.
                  316. 穴井 宏和. (2013). 数理最適化の実践ガイド. 講談社.
                  317. Yang, G., & Hu, E. J. (2022). Feature Learning in Infinite-Width Neural Networks. https://doi.org/10.48550/arXiv.2011.14522
                  318. Yang, G., Simon, J. B., & Bernstein, J. (2024). A Spectral Condition for Feature Learning. https://doi.org/10.48550/arXiv.2310.17813
                  319. Yang, G., Hu, E. J., Babuschkin, I., Sidor, S., Farhi, D., Pachocki, J., Liu, X., Chen, W., & Gao, J. (2022, March 7). Tensor programs V: tuning large neural networks via zero-shot hyperparameter transfer. https://www.microsoft.com/en-us/research/publication/tuning-large-neural-networks-via-zero-shot-hyperparameter-transfer/
                  320. 岩田 具治. (2015). トピックモデル. 講談社.
                  321. Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K. R., & Cao, Y. (2022, September 29). ReAct: synergizing reasoning and acting in language models. https://openreview.net/forum?id=WE_vluYUL-X
                  322. Yao, S., Yu, D., Zhao, J., Shafran, I., Griffiths, T. L., Cao, Y., & Narasimhan, K. R. (2023, November 2). Tree of Thoughts: deliberate problem solving with large language models. https://openreview.net/forum?id=5Xc1ecxO1h
                  323. 新田 尭之. 生成AIが描く日本の職業の明暗とその対応策.
                  324. Ye, P., Qian, J., Chen, J., Wu, C.-H., Zhou, Y., De Mars, S., Yang, F., & Zhang, L. (2018). Customized Regression Model for Airbnb Dynamic Pricing. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 932–940. https://doi.org/10.1145/3219819.3219830
                  325. 原田 達也. (2017). 画像認識. 講談社. http://opac.dl.itc.u-tokyo.ac.jp/opac/opac_details/?lang=0&amode=11&bibid=2003372347
                  326. Yue, Z., Zhuang, H., Bai, A., Hui, K., Jagerman, R., Zeng, H., Qin, Z., Wang, D., Wang, X., & Bendersky, M. (2024, October 4). Inference Scaling for Long-Context Retrieval Augmented Generation. https://openreview.net/forum?id=FSjIrOm1vz
                  327. Zachary, W. W. (1977). An information flow model for conflict and fission in small groups. Journal of Anthropological Research, 33(4), 452–473. http://www.jstor.org/stable/3629752
                  328. Zamanou, S., & Glaser, S. R. (1994). Moving toward participation and involvement: Managing and measuring organizational culture. Group & Organization Management, 19(4), 475–502. https://doi.org/10.1177/1059601194194005
                  329. Zelikman, E., Harik, G. R., Shao, Y., Jayasiri, V., Haber, N., & Goodman, N. (2024, August 26). Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking. https://openreview.net/forum?id=oRXPiSOGH9
                  330. Zelikman, E., Wu, Y., Mu, J., & Goodman, N. (2022, October 31). STaR: bootstrapping reasoning with reasoning. https://openreview.net/forum?id=_3ELRdg2sgI
                  331. Zeng, Z., Yu, J., Gao, T., Meng, Y., Goyal, T., & Chen, D. (2023, October 13). Evaluating Large Language Models at Evaluating Instruction Following. https://openreview.net/forum?id=tr0KidwPLc
                  332. 増田 直紀, & 今野 紀雄. (2010). 複雑ネットワーク―基礎から応用まで. 近代科学社.
                  333. Zhang, Y., & Teng, Z. (2021). Natural Language Processing: A Machine Learning Perspective. Cambridge University Press.
                  334. Zhang, B., & Sennrich, R. (2019). Root mean square layer normalization. Proceedings of the 33rd International Conference on Neural Information Processing Systems, 12381–12392.
                  335. Zhang, C., Bengio, S., Hardt, M., Recht, B., & Vinyals, O. (2017). Understanding deep learning requires rethinking generalization. https://doi.org/10.48550/arXiv.1611.03530
                  336. Zhao, X., Kang, Z., Feng, A., Levine, S., & Song, D. (2025). Learning to reason without external rewards. https://doi.org/10.48550/arXiv.2505.19590
                  337. Zheng, L., Chiang, W.-L., Sheng, Y., Zhuang, S., Wu, Z., Zhuang, Y., Lin, Z., Li, Z., Li, D., Xing, E., Zhang, H., Gonzalez, J. E., & Stoica, I. (2023, November 2). Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena. https://openreview.net/forum?id=uccHPGDlao
                  338. Zhou, C., Liu, P., Xu, P., Iyer, S., Sun, J., Mao, Y., Ma, X., Efrat, A., Yu, P., Yu, L., Zhang, S., Ghosh, G., Lewis, M., Zettlemoyer, L., & Levy, O. (2023, November 2). LIMA: less is more for alignment. https://openreview.net/forum?id=KBMOKmX2he
                  339. Zhou, Z.-H. (2021). Machine Learning. Springer Singapore. https://doi.org/10.1007/978-981-15-1967-3
                  340. Zhou, C., Yu, L., Babu, A., Tirumala, K., Yasunaga, M., Shamis, L., Kahn, J., Ma, X., Zettlemoyer, L., & Levy, O. (2024, October 4). Transfusion: predict the next token and diffuse images with one multi-modal model. https://openreview.net/forum?id=SI2hI0frk6
                  341. 周 志华. (2022). 機械学習 (大和田 勇人, 玄 光男, 下川 朝有, & 郝 新厂, Trans.). 近代科学社.
                  342. 竹村 彰通. (2020). 新装改訂版 現代数理統計学. 学術図書出版社.
                  343. 著作権法第三十条第四項, (2019). https://laws.e-gov.go.jp/law/345AC0000000048#Mp-Ch_2-Se_3-Ss_5-At_30_4
                  344. 著作権法第十条第三項第三号. Retrieved July 17, 2025, from https://laws.e-gov.go.jp/law/345AC0000000048#Mp-Ch_2-Se_1
                  345. 著作権法の一部を改正する法律(平成30年法律第30号)について | 文化庁. Retrieved July 17, 2025, from https://www.bunka.go.jp/seisaku/chosakuken/hokaisei/h30_hokaisei/