Lecture Materials
Lecture slides are available here (view only)
Source code index
Reference list
Note: We do not recommend trying to read all of these from end to end (textbook-style references aside). Read the sections you need, depending on what you want to learn.
AIと社会-AI関連法規・AI関連市場 (.bib)
- 100,000 H100 clusters: power, network topology, ethernet vs infiniband, reliability, failures, checkpointing. (2024). SemiAnalysis. https://semianalysis.com/2024/06/17/100000-h100-clusters-power-network/
- Bloomberg. (2024). Generative AI 2024 report: assessing opportunities and disruptions in an evolving trillion-dollar market. https://www.bloomberg.com/professional/products/bloomberg-terminal/research/bloomberg-intelligence/download/generative-ai-2024-report/
- Hu, K. (2023). ChatGPT sets record for fastest-growing user base - analyst note. Reuters. https://www.reuters.com/technology/chatgpt-sets-record-fastest-growing-user-base-analyst-note-2023-02-01/
- NVIDIA announces financial results for first quarter fiscal 2026. (2025). NVIDIA Newsroom. https://nvidianews.nvidia.com/news/nvidia-announces-financial-results-for-first-quarter-fiscal-2026
- 人工知能関連技術の研究開発及び活用の推進に関する法律, (2025). https://laws.e-gov.go.jp/law/507AC0000000053
- 日本放送協会. (2025). 「性的ディープフェイク」で行政罰の条例改正案を可決 鳥取県. NHKニュース. https://www3.nhk.or.jp/news/html/20250630/k10014848661000.html
- 特許法. Retrieved July 18, 2025, from https://laws.e-gov.go.jp/law/334AC0000000121#Mp-Ch_1
- 特許庁. (2019). 特許・実用新案審査ハンドブック 附属書B 第1章 コンピュータソフトウエア関連発明. https://www.jpo.go.jp/system/laws/rule/guideline/patent/handbook_shinsa/document/index/app_b1.pdf
- 文化審議会著作権分科会法制度小委員会. (2024). AIと著作権に関する考え方について. https://www.bunka.go.jp/seisaku/bunkashingikai/chosakuken/pdf/94037901_01.pdf
- 著作権法第三十条第四項, (2019). https://laws.e-gov.go.jp/law/345AC0000000048#Mp-Ch_2-Se_3-Ss_5-At_30_4
- 著作権法第十条第三項第三号. Retrieved July 17, 2025, from https://laws.e-gov.go.jp/law/345AC0000000048#Mp-Ch_2-Se_1
- 著作権法の一部を改正する法律(平成30年法律第30号)について | 文化庁. Retrieved July 17, 2025, from https://www.bunka.go.jp/seisaku/chosakuken/hokaisei/h30_hokaisei/
AIと社会-倫理・安全・ガバナンス (.bib)
- Angwin, J., Larson, J., Mattu, S., Kirchner, L., & ProPublica. (2016). Machine Bias. ProPublica. https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
- Chan, C., Ginosar, S., Zhou, T., & Efros, A. (2019). Everybody dance now. 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 5932–5941. https://doi.org/10.1109/ICCV.2019.00603
- Chapagain, D., Kshetri, N., & Aryal, B. (2024). Deepfake disasters: a comprehensive review of technology, ethical concerns, countermeasures, and societal implications. 2024 International Conference on Emerging Trends in Networks and Computer Communications (ETNCC), 1–9. https://doi.org/10.1109/ETNCC63262.2024.10767452
- Chouldechova, A. (2017). Fair Prediction with Disparate Impact: A Study of Bias in Recidivism Prediction Instruments. Big Data, 5(2), 153–163. https://doi.org/10.1089/big.2016.0047
- Corbett-Davies, S., Pierson, E., Feller, A., Goel, S., & Huq, A. (2017). Algorithmic decision making and the cost of fairness. Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 797–806. https://doi.org/10.1145/3097983.3098095
- delving_2025_july.png. (2025). GitHub. https://github.com/berenslab/llm-excess-vocab/blob/main/figures/post-publication-updates/delving_2025_july.png
- Fernando, T., Priyasad, D., Sridharan, S., Ross, A., & Fookes, C. (2025). Face deepfakes: a comprehensive review. https://doi.org/10.48550/arXiv.2502.09812
- 会計担当が38億円を詐欺グループに送金、ビデオ会議のCFOは偽物. (2024). CNN.co.jp. https://www.cnn.co.jp/world/35214839.html
- Kleinberg, J., Mullainathan, S., & Raghavan, M. (2017). Inherent trade-offs in the fair determination of risk scores. In C. H. Papadimitriou (Ed.), 8th Innovations in Theoretical Computer Science Conference (Vol. 67, pp. 43:1–43:23). Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Germany. https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2017.43
- Kobak, D., González-Márquez, R., Horvát, E.-Á., & Lause, J. (2025). Delving into LLM-assisted writing in biomedical publications through excess vocabulary. Science Advances, 11(27), eadt3813. https://doi.org/10.1126/sciadv.adt3813
- Li, P., Yang, J., Islam, M. A., & Ren, S. (2025). Making AI less "thirsty". Communications of the ACM, 68(7), 54–61. https://doi.org/10.1145/3724499
- Manzini, A., Keeling, G., Alberts, L., Vallor, S., Morris, M. R., & Gabriel, I. (2024). The code that binds us: navigating the appropriateness of human-AI assistant relationships. Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 7(1), 943–957. https://doi.org/10.1609/aies.v7i1.31694
- Marx, K. (1959). Economic & Philosophic Manuscripts of 1844 (M. Milligan, Tran.). Progress Publishers. https://www.marxists.org/archive/marx/works/1844/manuscripts/preface.htm
- Mata v. Avianca, Inc. (Number 1:22-cv-01461). District Court, S.D. New York. Retrieved July 13, 2025, from https://www.courtlistener.com/docket/63107798/mata-v-avianca-inc/
- Moffatt v. Air Canada. (2024). In CanLII (Vol. 149, Number SC-2023-005609). BCCRT. https://canlii.ca/t/k2spq
- OWASP. (2024). OWASP Top 10 for LLM applications & generative AI (Technical Report OWASP PDF v4.2.0a 20241114-202703).
- 生成AIでランサムウェアを作成した容疑者の摘発事例を考察. (2025). Trend Micro. https://www.trendmicro.com/ja_jp/jp-security/24/e/breaking-securitynews-20240529-02.html
- 生成AI悪用し楽天モバイルに不正アクセス、1000件以上の回線入手し転売か…容疑で中高生3人逮捕. (2025). 読売新聞オンライン. https://www.yomiuri.co.jp/national/20250226-OYT1T50205/
- 生成AI悪用しウイルス作成、有罪判決…IT知識なくとも「1か月ぐらいで簡単に作れた」. (2024). 読売新聞オンライン. https://www.yomiuri.co.jp/national/20241025-OYT1T50209/
- Storchan, V., Kumar, R., Chowdhury, R., Goldfarb-Tarrant, S., & Cattell, S. (2024). Generative AI red teaming challenge: transparency report [Technical Report]. DEF CON. https://drive.google.com/file/d/1JqpbIP6DNomkb32umLoiEPombK2-0Rc-/view
- Tabassi, E. (2023). Artificial Intelligence Risk Management Framework (AI RMF 1.0). National Institute of Standards and Technology. https://doi.org/10.6028/nist.ai.100-1
- Weidinger, L., Barnhart, J., Brennan, J., Butterfield, C., Young, S., Hawkins, W., Hendricks, L. A., Comanescu, R., Chang, O., Rodriguez, M., Beroshi, J., Bloxwich, D., Proleev, L., Chen, J., Farquhar, S., Ho, L., Gabriel, I., Dafoe, A., & Isaac, W. (2024). Holistic safety and responsibility evaluations of advanced AI models. https://doi.org/10.48550/arXiv.2404.14068
- Writers Guild of America. (2023). Summary of the 2023 WGA MBA. WGA Contract 2023. https://www.wgacontract2023.org/the-campaign/summary-of-the-2023-wga-mba
AIと社会-実社会応用の事例-分類 (.bib)
AIと社会-実社会応用の事例-回帰 (.bib)
- DeepETA: How Uber Predicts Arrival Times Using Deep Learning. (2022). Uber Blog. https://www.uber.com/blog/deepeta-how-uber-predicts-arrival-times/
- GO株式会社. (2023). タクシーアプリ『GO』の データ基盤の全体像. https://www.slideshare.net/slideshow/ss-258369181/258369181
- Ye, P., Qian, J., Chen, J., Wu, C.-H., Zhou, Y., De Mars, S., Yang, F., & Zhang, L. (2018). Customized Regression Model for Airbnb Dynamic Pricing. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 932–940. https://doi.org/10.1145/3219819.3219830
AIと社会-実社会応用の事例 (.bib)
- DeepETA: How Uber Predicts Arrival Times Using Deep Learning. (2022). Uber Blog. https://www.uber.com/blog/deepeta-how-uber-predicts-arrival-times/
- GO株式会社. (2023). タクシーアプリ『GO』の データ基盤の全体像. https://www.slideshare.net/slideshow/ss-258369181/258369181
- Ye, P., Qian, J., Chen, J., Wu, C.-H., Zhou, Y., De Mars, S., Yang, F., & Zhang, L. (2018). Customized Regression Model for Airbnb Dynamic Pricing. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 932–940. https://doi.org/10.1145/3219819.3219830
AIと社会-法律・AI関連市場 (.bib)
AIと社会-生産性と職務の変化 (.bib)
- Cui, Z. (K.), Demirer, M., Jaffe, S., Musolff, L., Peng, S., & Salz, T. (2025). The effects of generative AI on high-skilled work: evidence from three field experiments with software developers (Number 4945566) [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.4945566
- Eloundou, T., Manning, S., Mishkin, P., & Rock, D. (2023). GPTs are GPTs: an early look at the labor market impact potential of large language models. https://doi.org/10.48550/arXiv.2303.10130
- Filippucci, F., Gal, P., Jona-Lasinio, C., Leandro, A., & Nicoletti, G. (2024). The impact of artificial intelligence on productivity, distribution and growth: key mechanisms, initial evidence and policy challenges (OECD Artificial Intelligence Papers No. 15). OECD Publishing. https://doi.org/10.1787/8d900037-en
- Filippucci, F., Gal, P., & Schief, M. (2024). Miracle or myth? Assessing the macroeconomic productivity gains from artificial intelligence (OECD Artificial Intelligence Papers No. 29). OECD Publishing. https://doi.org/10.1787/b524a072-en
- Handa, K., Tamkin, A., McCain, M., Huang, S., Durmus, E., Heck, S., Mueller, J., Hong, J., Ritchie, S., Belonax, T., Troy, K. K., Amodei, D., Kaplan, J., Clark, J., & Ganguli, D. (2025). Which economic tasks are performed with AI? Evidence from millions of Claude conversations. https://doi.org/10.48550/arXiv.2503.04761
AIと社会-生産性と職業への影響 (.bib)
- Cui, Z. (K.), Demirer, M., Jaffe, S., Musolff, L., Peng, S., & Salz, T. (2025). The effects of generative AI on high-skilled work: evidence from three field experiments with software developers (Number 4945566) [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.4945566
AIと社会-生産性・職務変化・雇用市場 (.bib)
- Brynjolfsson, E., Li, D., & Raymond, L. R. (2023). Generative AI at work (Number 31161) [Working Paper]. https://doi.org/10.3386/w31161
- Brynjolfsson, E., Mitchell, T., & Rock, D. (2018). What can machines learn, and what does it mean for occupations and the economy? AEA Papers and Proceedings, 108, 43–47. https://doi.org/10.1257/pandp.20181019
- Cazzaniga, M. (2024). Gen-AI: artificial intelligence and the future of work. Staff Discussion Notes, 2024(001), 1. https://doi.org/10.5089/9798400262548.006
- Cui, Z. (K.), Demirer, M., Jaffe, S., Musolff, L., Peng, S., & Salz, T. (2025). The effects of generative AI on high-skilled work: evidence from three field experiments with software developers (Number 4945566) [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.4945566
- Eloundou, T., Manning, S., Mishkin, P., & Rock, D. (2023). GPTs are GPTs: an early look at the labor market impact potential of large language models. https://doi.org/10.48550/arXiv.2303.10130
- Felten, E. W., Raj, M., & Seamans, R. (2023). How will Language Modelers like ChatGPT Affect Occupations and Industries? (Number 4375268) [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.4375268
- Felten, E. W., Raj, M., & Seamans, R. (2018). A method to link advances in artificial intelligence to occupational abilities. AEA Papers and Proceedings, 108, 54–57. https://doi.org/10.1257/pandp.20181021
- Felten, E. W., Raj, M., & Seamans, R. (2023). Occupational heterogeneity in exposure to Generative AI (Number 4414065) [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.4414065
- Filippucci, F., Gal, P., Jona-Lasinio, C., Leandro, A., & Nicoletti, G. (2024). The impact of artificial intelligence on productivity, distribution and growth: key mechanisms, initial evidence and policy challenges (OECD Artificial Intelligence Papers No. 15). OECD Publishing. https://doi.org/10.1787/8d900037-en
- Filippucci, F., Gal, P., & Schief, M. (2024). Miracle or myth? Assessing the macroeconomic productivity gains from artificial intelligence (OECD Artificial Intelligence Papers No. 29). OECD Publishing. https://doi.org/10.1787/b524a072-en
- Handa, K., Tamkin, A., McCain, M., Huang, S., Durmus, E., Heck, S., Mueller, J., Hong, J., Ritchie, S., Belonax, T., Troy, K. K., Amodei, D., Kaplan, J., Clark, J., & Ganguli, D. (2025). Which economic tasks are performed with AI? Evidence from millions of Claude conversations. https://doi.org/10.48550/arXiv.2503.04761
- Lane, M. (2024). Who will be the workers most affected by AI? A closer look at the impact of AI on women, low-skilled workers and other groups (OECD Artificial Intelligence Papers No. 26). OECD Publishing. https://doi.org/10.1787/14dc6f89-en
- Lipsey, R. G., Carlaw, K. I., & Bekar, C. T. (2005). Economic Transformations: General Purpose Technologies and Long-Term Economic Growth. Oxford University Press. https://doi.org/10.1093/oso/9780199285648.001.0001
- Mäkelä, E., & Stephany, F. (2025). Complement or substitute? How AI increases the demand for human skills (Number 5153230) [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.5153230
- Noy, S., & Zhang, W. (2023). Experimental evidence on the productivity effects of generative artificial intelligence. Science, 381(6654), 187–192. https://doi.org/10.1126/science.adh2586
- Peng, S., Kalliamvakou, E., Cihon, P., & Demirer, M. (2023). The impact of AI on developer productivity: evidence from GitHub Copilot. https://doi.org/10.48550/arXiv.2302.06590
- Webb, M. (2019). The Impact of Artificial Intelligence on the Labor Market (Number 3482150) [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.3482150
- World Economic Forum. (2025). Future of Jobs Report 2025 [Insight Report]. https://www.weforum.org/publications/the-future-of-jobs-report-2025/digest/
- 新田 尭之. 生成AIが描く日本の職業の明暗とその対応策.
AIと社会 (.bib)
- 100,000 H100 clusters: power, network topology, ethernet vs infiniband, reliability, failures, checkpointing. (2024). SemiAnalysis. https://semianalysis.com/2024/06/17/100000-h100-clusters-power-network/
- Angwin, J., Larson, J., Mattu, S., Kirchner, L., & ProPublica. (2016). Machine Bias. ProPublica. https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
- Bloomberg. (2024). Generative AI 2024 report: assessing opportunities and disruptions in an evolving trillion-dollar market. https://www.bloomberg.com/professional/products/bloomberg-terminal/research/bloomberg-intelligence/download/generative-ai-2024-report/
- Brynjolfsson, E., Li, D., & Raymond, L. R. (2023). Generative AI at work (Number 31161) [Working Paper]. https://doi.org/10.3386/w31161
- Brynjolfsson, E., Mitchell, T., & Rock, D. (2018). What can machines learn, and what does it mean for occupations and the economy? AEA Papers and Proceedings, 108, 43–47. https://doi.org/10.1257/pandp.20181019
- Cazzaniga, M. (2024). Gen-AI: artificial intelligence and the future of work. Staff Discussion Notes, 2024(001), 1. https://doi.org/10.5089/9798400262548.006
- Chan, C., Ginosar, S., Zhou, T., & Efros, A. (2019). Everybody dance now. 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 5932–5941. https://doi.org/10.1109/ICCV.2019.00603
- Chapagain, D., Kshetri, N., & Aryal, B. (2024). Deepfake disasters: a comprehensive review of technology, ethical concerns, countermeasures, and societal implications. 2024 International Conference on Emerging Trends in Networks and Computer Communications (ETNCC), 1–9. https://doi.org/10.1109/ETNCC63262.2024.10767452
- Chouldechova, A. (2017). Fair Prediction with Disparate Impact: A Study of Bias in Recidivism Prediction Instruments. Big Data, 5(2), 153–163. https://doi.org/10.1089/big.2016.0047
- Corbett-Davies, S., Pierson, E., Feller, A., Goel, S., & Huq, A. (2017). Algorithmic decision making and the cost of fairness. Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 797–806. https://doi.org/10.1145/3097983.3098095
- Cui, Z. (K.), Demirer, M., Jaffe, S., Musolff, L., Peng, S., & Salz, T. (2025). The effects of generative AI on high-skilled work: evidence from three field experiments with software developers (Number 4945566) [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.4945566
- DeepETA: How Uber Predicts Arrival Times Using Deep Learning. (2022). Uber Blog. https://www.uber.com/blog/deepeta-how-uber-predicts-arrival-times/
- delving_2025_july.png. (2025). GitHub. https://github.com/berenslab/llm-excess-vocab/blob/main/figures/post-publication-updates/delving_2025_july.png
- Eloundou, T., Manning, S., Mishkin, P., & Rock, D. (2023). GPTs are GPTs: an early look at the labor market impact potential of large language models. https://doi.org/10.48550/arXiv.2303.10130
- Felten, E. W., Raj, M., & Seamans, R. (2023). How will Language Modelers like ChatGPT Affect Occupations and Industries? (Number 4375268) [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.4375268
- Felten, E. W., Raj, M., & Seamans, R. (2018). A method to link advances in artificial intelligence to occupational abilities. AEA Papers and Proceedings, 108, 54–57. https://doi.org/10.1257/pandp.20181021
- Felten, E. W., Raj, M., & Seamans, R. (2023). Occupational heterogeneity in exposure to Generative AI (Number 4414065) [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.4414065
- Fernando, T., Priyasad, D., Sridharan, S., Ross, A., & Fookes, C. (2025). Face deepfakes: a comprehensive review. https://doi.org/10.48550/arXiv.2502.09812
- Filippucci, F., Gal, P., Jona-Lasinio, C., Leandro, A., & Nicoletti, G. (2024). The impact of artificial intelligence on productivity, distribution and growth: key mechanisms, initial evidence and policy challenges (OECD Artificial Intelligence Papers No. 15). OECD Publishing. https://doi.org/10.1787/8d900037-en
- Filippucci, F., Gal, P., & Schief, M. (2024). Miracle or myth? Assessing the macroeconomic productivity gains from artificial intelligence (OECD Artificial Intelligence Papers No. 29). OECD Publishing. https://doi.org/10.1787/b524a072-en
- GO株式会社. (2023). タクシーアプリ『GO』の データ基盤の全体像. https://www.slideshare.net/slideshow/ss-258369181/258369181
- Handa, K., Tamkin, A., McCain, M., Huang, S., Durmus, E., Heck, S., Mueller, J., Hong, J., Ritchie, S., Belonax, T., Troy, K. K., Amodei, D., Kaplan, J., Clark, J., & Ganguli, D. (2025). Which economic tasks are performed with AI? Evidence from millions of Claude conversations. https://doi.org/10.48550/arXiv.2503.04761
- Hu, K. (2023). ChatGPT sets record for fastest-growing user base - analyst note. Reuters. https://www.reuters.com/technology/chatgpt-sets-record-fastest-growing-user-base-analyst-note-2023-02-01/
- 会計担当が38億円を詐欺グループに送金、ビデオ会議のCFOは偽物. (2024). CNN.co.jp. https://www.cnn.co.jp/world/35214839.html
- Kleinberg, J., Mullainathan, S., & Raghavan, M. (2017). Inherent trade-offs in the fair determination of risk scores. In C. H. Papadimitriou (Ed.), 8th Innovations in Theoretical Computer Science Conference (Vol. 67, pp. 43:1–43:23). Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Germany. https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2017.43
- Kobak, D., González-Márquez, R., Horvát, E.-Á., & Lause, J. (2025). Delving into LLM-assisted writing in biomedical publications through excess vocabulary. Science Advances, 11(27), eadt3813. https://doi.org/10.1126/sciadv.adt3813
- Lane, M. (2024). Who will be the workers most affected by AI? A closer look at the impact of AI on women, low-skilled workers and other groups (OECD Artificial Intelligence Papers No. 26). OECD Publishing. https://doi.org/10.1787/14dc6f89-en
- Li, P., Yang, J., Islam, M. A., & Ren, S. (2025). Making AI less "thirsty". Communications of the ACM, 68(7), 54–61. https://doi.org/10.1145/3724499
- Lipsey, R. G., Carlaw, K. I., & Bekar, C. T. (2005). Economic Transformations: General Purpose Technologies and Long-Term Economic Growth. Oxford University Press. https://doi.org/10.1093/oso/9780199285648.001.0001
- Mäkelä, E., & Stephany, F. (2025). Complement or substitute? How AI increases the demand for human skills (Number 5153230) [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.5153230
- Manzini, A., Keeling, G., Alberts, L., Vallor, S., Morris, M. R., & Gabriel, I. (2024). The code that binds us: navigating the appropriateness of human-AI assistant relationships. Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 7(1), 943–957. https://doi.org/10.1609/aies.v7i1.31694
- Marx, K. (1959). Economic & Philosophic Manuscripts of 1844 (M. Milligan, Tran.). Progress Publishers. https://www.marxists.org/archive/marx/works/1844/manuscripts/preface.htm
- Mata v. Avianca, Inc. (Number 1:22-cv-01461). District Court, S.D. New York. Retrieved July 13, 2025, from https://www.courtlistener.com/docket/63107798/mata-v-avianca-inc/
- Moffatt v. Air Canada. (2024). In CanLII (Vol. 149, Number SC-2023-005609). BCCRT. https://canlii.ca/t/k2spq
- Noy, S., & Zhang, W. (2023). Experimental evidence on the productivity effects of generative artificial intelligence. Science, 381(6654), 187–192. https://doi.org/10.1126/science.adh2586
- NVIDIA announces financial results for first quarter fiscal 2026. (2025). NVIDIA Newsroom. https://nvidianews.nvidia.com/news/nvidia-announces-financial-results-for-first-quarter-fiscal-2026
- OWASP. (2024). OWASP Top 10 for LLM applications & generative AI (Technical Report OWASP PDF v4.2.0a 20241114-202703).
- Peng, S., Kalliamvakou, E., Cihon, P., & Demirer, M. (2023). The impact of AI on developer productivity: evidence from GitHub Copilot. https://doi.org/10.48550/arXiv.2302.06590
- 人工知能関連技術の研究開発及び活用の推進に関する法律, (2025). https://laws.e-gov.go.jp/law/507AC0000000053
- 日本放送協会. (2025). 「性的ディープフェイク」で行政罰の条例改正案を可決 鳥取県. NHKニュース. https://www3.nhk.or.jp/news/html/20250630/k10014848661000.html
- 生成AIでランサムウェアを作成した容疑者の摘発事例を考察. (2025). Trend Micro. https://www.trendmicro.com/ja_jp/jp-security/24/e/breaking-securitynews-20240529-02.html
- 生成AI悪用し楽天モバイルに不正アクセス、1000件以上の回線入手し転売か…容疑で中高生3人逮捕. (2025). 読売新聞オンライン. https://www.yomiuri.co.jp/national/20250226-OYT1T50205/
- 生成AI悪用しウイルス作成、有罪判決…IT知識なくとも「1か月ぐらいで簡単に作れた」. (2024). 読売新聞オンライン. https://www.yomiuri.co.jp/national/20241025-OYT1T50209/
- Storchan, V., Kumar, R., Chowdhury, R., Goldfarb-Tarrant, S., & Cattell, S. (2024). Generative AI red teaming challenge: transparency report [Technical Report]. DEF CON. https://drive.google.com/file/d/1JqpbIP6DNomkb32umLoiEPombK2-0Rc-/view
- Tabassi, E. (2023). Artificial Intelligence Risk Management Framework (AI RMF 1.0). National Institute of Standards and Technology. https://doi.org/10.6028/nist.ai.100-1
- Tamkin, A., McCain, M., Handa, K., Durmus, E., Lovitt, L., Rathi, A., Huang, S., Mountfield, A., Hong, J., Ritchie, S., Stern, M., Clarke, B., Goldberg, L., Sumers, T. R., Mueller, J., McEachen, W., Mitchell, W., Carter, S., Clark, J., … Ganguli, D. (2024). Clio: privacy-preserving insights into real-world AI use. https://doi.org/10.48550/arXiv.2412.13678
- 特許法. Retrieved July 18, 2025, from https://laws.e-gov.go.jp/law/334AC0000000121#Mp-Ch_1
- 特許庁. (2019). 特許・実用新案審査ハンドブック 附属書B 第1章 コンピュータソフトウエア関連発明. https://www.jpo.go.jp/system/laws/rule/guideline/patent/handbook_shinsa/document/index/app_b1.pdf
- Webb, M. (2019). The Impact of Artificial Intelligence on the Labor Market (Number 3482150) [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.3482150
- Weidinger, L., Barnhart, J., Brennan, J., Butterfield, C., Young, S., Hawkins, W., Hendricks, L. A., Comanescu, R., Chang, O., Rodriguez, M., Beroshi, J., Bloxwich, D., Proleev, L., Chen, J., Farquhar, S., Ho, L., Gabriel, I., Dafoe, A., & Isaac, W. (2024). Holistic safety and responsibility evaluations of advanced AI models. https://doi.org/10.48550/arXiv.2404.14068
- 文化審議会著作権分科会法制度小委員会. (2024). AIと著作権に関する考え方について. https://www.bunka.go.jp/seisaku/bunkashingikai/chosakuken/pdf/94037901_01.pdf
- World Economic Forum. (2025). Future of Jobs Report 2025 [Insight Report]. https://www.weforum.org/publications/the-future-of-jobs-report-2025/digest/
- Writers Guild of America. (2023). Summary of the 2023 WGA MBA. WGA Contract 2023. https://www.wgacontract2023.org/the-campaign/summary-of-the-2023-wga-mba
- 新田 尭之. 生成AIが描く日本の職業の明暗とその対応策.
- Ye, P., Qian, J., Chen, J., Wu, C.-H., Zhou, Y., De Mars, S., Yang, F., & Zhang, L. (2018). Customized Regression Model for Airbnb Dynamic Pricing. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 932–940. https://doi.org/10.1145/3219819.3219830
- 著作権法第三十条第四項, (2019). https://laws.e-gov.go.jp/law/345AC0000000048#Mp-Ch_2-Se_3-Ss_5-At_30_4
- 著作権法第十条第三項第三号. Retrieved July 17, 2025, from https://laws.e-gov.go.jp/law/345AC0000000048#Mp-Ch_2-Se_1
- 著作権法の一部を改正する法律(平成30年法律第30号)について | 文化庁. Retrieved July 17, 2025, from https://www.bunka.go.jp/seisaku/chosakuken/hokaisei/h30_hokaisei/
イントロダクション (.bib)
- 人工知能学会. (2023年5月版). AIマップβ2.0. https://www.ai-gakkai.or.jp/aimap/
- 神嶌 敏弘. (2019). 変わりゆく機械学習と変わらない機械学習. 日本物理学会誌, 74(1), 5–13. https://doi.org/10.11316/butsuri.74.1_5
データサイエンス-中間解析 (.bib)
- O’Brien, P. C., & Fleming, T. R. (1979). A multiple testing procedure for clinical trials. Biometrics, 35(3), 549–556. https://doi.org/10.2307/2530245
- Proschan, M. A., Lan, K. K. G., & Wittes, J. T. (2006). Statistical Monitoring of Clinical Trials: A Unified Approach. Springer.
データサイエンス (.bib)
- 北川 源四郎, 竹村 彰通, 赤穂 昭太郎, 今泉 允聡, 内田 誠一, 清 智也, 高野 渉, 辻 真吾, 原 尚幸, 久野 遼平, 松原 仁, 宮地 充子, 森畑 明昌, & 宿久 洋. (2023). 応用基礎としてのデータサイエンス AI×データ活用の実践. 講談社.
- O’Brien, P. C., & Fleming, T. R. (1979). A multiple testing procedure for clinical trials. Biometrics, 35(3), 549–556. https://doi.org/10.2307/2530245
- Proschan, M. A., Lan, K. K. G., & Wittes, J. T. (2006). Statistical Monitoring of Clinical Trials: A Unified Approach. Springer.
データセット (.bib)
- Dalal, N., & Triggs, B. (2005). Histograms of oriented gradients for human detection. 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), 1, 886–893. https://doi.org/10.1109/CVPR.2005.177
- De Cock, D. (2011). Ames, Iowa: alternative to the Boston housing data as an end of semester regression project. Journal of Statistics Education, 19(3), 8. https://doi.org/10.1080/10691898.2011.11889627
- Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., & Fei-Fei, L. (2009). ImageNet: A large-scale hierarchical image database. 2009 IEEE Conference on Computer Vision and Pattern Recognition, 248–255. https://doi.org/10.1109/CVPR.2009.5206848
- Forina, M., Armanino, C., Castino, M., & Ubigli, M. (1986). Multivariate data analysis as a discriminating method of the origin of wines. VITIS - Journal of Grapevine Research, 25(3), 189–201. https://doi.org/10.5073/vitis.1986.25.189-201
- ndl-lab/pdmocrdataset-part1. (2024). ndl-lab. https://github.com/ndl-lab/pdmocrdataset-part1
- Penedo, G., Malartic, Q., Hesslow, D., Cojocaru, R., Cappelli, A., Alobeidli, H., Pannier, B., Almazrouei, E., & Launay, J. (2023). The RefinedWeb dataset for Falcon LLM: outperforming curated corpora with web data, and web data only. https://doi.org/10.48550/arXiv.2306.01116
- Taori, R., Gulrajani, I., Zhang, T., Dubois, Y., Li, X., Guestrin, C., Liang, P., & Hashimoto, T. B. (2023). Stanford Alpaca: An Instruction-following LLaMA model [Data set]. stanford_alpaca.
- workpiles. (2016). CUCUMBER-9. https://github.com/workpiles/CUCUMBER-9
データマイニング-グラフマイニング (.bib)
- 石黒 勝彦, & 林 浩平. (2016). 関係データ学習. 講談社.
- 増田 直紀, & 今野 紀雄. (2010). 複雑ネットワーク―基礎から応用まで. 近代科学社.
データマイニング-音声データマイニング (.bib)
データマイニング (.bib)
- Rokach, L., & Maimon, O. (2014). Data Mining with Decision Trees: Theory and Applications (2nd ed., Vol. 81). WORLD SCIENTIFIC. https://doi.org/10.1142/9097
- 石黒 勝彦, & 林 浩平. (2016). 関係データ学習. 講談社.
- 増田 直紀, & 今野 紀雄. (2010). 複雑ネットワーク―基礎から応用まで. 近代科学社.
数学-情報理論 (.bib)
- Cover, T. M., & Thomas, J. A. (2006). Elements of Information Theory (2nd ed). Wiley-Interscience.
数学-数理最適化 (.bib)
- Defazio, A., Yang, X. A., Khaled, A., Mishchenko, K., Mehta, H., & Cutkosky, A. (2024, November 6). The road less scheduled. https://openreview.net/forum?id=0XeNkkENuI
- Douze, M., Guzhva, A., Deng, C., Johnson, J., Szilvasy, G., Mazaré, P.-E., Lomeli, M., Hosseini, L., & Jégou, H. (2025). The Faiss library. https://doi.org/10.48550/arXiv.2401.08281
- Izmailov, P., Podoprikhin, D., Garipov, T., Vetrov, D., & Wilson, A. G. (2018). Averaging weights leads to wider optima and better generalization. Proceedings of the Conference on Uncertainty in Artificial Intelligence.
- 金森 敬文, 鈴木 大慈, 竹内 一郎, & 佐藤 一誠. (2016). 機械学習のための連続最適化. 講談社.
- Kingma, D. P., & Ba, J. (2015). Adam: A method for stochastic optimization. Proceedings of 3rd International Conference for Learning Representations. http://arxiv.org/abs/1412.6980
- 鈴木 大慈. (2015). 確率的最適化. 講談社.
- 梅谷 俊治. (2020). しっかり学ぶ数理最適化 モデルからアルゴリズムまで. 講談社.
- 山下 信雄. (2015). 非線形計画法. 朝倉書店. http://opac.dl.itc.u-tokyo.ac.jp/opac/opac_details/?lang=0&amode=11&bibid=2003278170
- Sato, T. (2024). 最適化超入門. https://speakerdeck.com/tkm2261/zui-shi-hua-chao-ru-men
- 穴井 宏和, & 斉藤 努. (2015). 今日から使える!組合せ最適化 離散問題ガイドブック. 講談社.
- 穴井 宏和. (2013). 数理最適化の実践ガイド. 講談社.
数学-確率統計 (.bib)
- Çinlar, E. (2011). Probability and Stochastics. Springer New York. https://doi.org/10.1007/978-0-387-87859-1
- Hall, P., Marron, J. S., & Neeman, A. (2005). Geometric representation of high dimension, low sample size data. Journal of the Royal Statistical Society Series B: Statistical Methodology, 67(3), 427–444. https://doi.org/10.1111/j.1467-9868.2005.00510.x
- Tsybakov, A. B. (2009). Introduction to Nonparametric Estimation. Springer.
- Vershynin, R. (2018). High-Dimensional Probability: An Introduction with Applications in Data Science. Cambridge University Press.
- Wainwright, M. J. (2019). High-Dimensional Statistics: A Non-Asymptotic Viewpoint (1st ed.). Cambridge University Press. https://www.cambridge.org/core/product/identifier/9781108627771/type/book
- 竹村 彰通. (2020). 新装改訂版 現代数理統計学. 学術図書出版社.
数学-線型代数・関数解析 (.bib)
- 赤穂 昭太郎. (2008). カーネル多変量解析―非線形データ解析の新しい展開. 岩波書店.
- 村上 正康, 稲葉 尚志, & 野沢 宗平. (1989). 演習 線形代数 (改訂版). 培風館.
- 福水 健次. (2010). カーネル法入門―正定値カーネルによるデータ解析. 朝倉書店.
- 黒田 成俊. (1980). 関数解析 (Number 15). 共立出版.
- Luenberger, D. G. (1997). Optimization by Vector Space Methods (1st ed.). John Wiley & Sons, Inc.
- Petersen, K. B., & Pedersen, M. S. (2012). The Matrix Cookbook. Technical University of Denmark. http://www2.compute.dtu.dk/pubdb/pubs/3274-full.html
数学-高次元現象 (.bib)
- Aggarwal, C. C., Hinneburg, A., & Keim, D. A. (2001). On the surprising behavior of distance metrics in high dimensional spaces. Proceedings of the 8th International Conference on Database Theory, 420–434.
数学 (.bib)
- Aggarwal, C. C., Hinneburg, A., & Keim, D. A. (2001). On the surprising behavior of distance metrics in high dimensional spaces. Proceedings of the 8th International Conference on Database Theory, 420–434.
- 赤穂 昭太郎. (2008). カーネル多変量解析―非線形データ解析の新しい展開. 岩波書店.
- Çinlar, E. (2011). Probability and Stochastics. Springer New York. https://doi.org/10.1007/978-0-387-87859-1
- Cover, T. M., & Thomas, J. A. (2006). Elements of Information Theory (2nd ed). Wiley-Interscience.
- 村上 正康, 稲葉 尚志, & 野沢 宗平. (1989). 演習 線形代数 (改訂版). 培風館.
- Defazio, A., Yang, X. A., Khaled, A., Mishchenko, K., Mehta, H., & Cutkosky, A. (2024, November 6). The road less scheduled. https://openreview.net/forum?id=0XeNkkENuI
- Douze, M., Guzhva, A., Deng, C., Johnson, J., Szilvasy, G., Mazaré, P.-E., Lomeli, M., Hosseini, L., & Jégou, H. (2025). The Faiss library. https://doi.org/10.48550/arXiv.2401.08281
- 福水 健次. (2010). カーネル法入門―正定値カーネルによるデータ解析. 朝倉書店.
- Hall, P., Marron, J. S., & Neeman, A. (2005). Geometric representation of high dimension, low sample size data. Journal of the Royal Statistical Society Series B: Statistical Methodology, 67(3), 427–444. https://doi.org/10.1111/j.1467-9868.2005.00510.x
- 黒田 成俊. (1980). 関数解析 (Number 15). 共立出版.
- Izmailov, P., Podoprikhin, D., Garipov, T., Vetrov, D., & Wilson, A. G. (2018). Averaging weights leads to wider optima and better generalization. Proceedings of the Conference on Uncertainty in Artificial Intelligence.
- 金森 敬文, 鈴木 大慈, 竹内 一郎, & 佐藤 一誠. (2016). 機械学習のための連続最適化. 講談社.
- Kingma, D. P., & Ba, J. (2015). Adam: A method for stochastic optimization. Proceedings of 3rd International Conference for Learning Representations. http://arxiv.org/abs/1412.6980
- 鈴木 大慈. (2015). 確率的最適化. 講談社.
- Luenberger, D. G. (1997). Optimization by Vector Space Methods (1st ed.). John Wiley & Sons, Inc.
- 梅谷 俊治. (2020). しっかり学ぶ数理最適化 モデルからアルゴリズムまで. 講談社.
- Petersen, K. B., & Pedersen, M. S. (2012). The Matrix Cookbook. Technical University of Denmark. http://www2.compute.dtu.dk/pubdb/pubs/3274-full.html
- 山下 信雄. (2015). 非線形計画法. 朝倉書店. http://opac.dl.itc.u-tokyo.ac.jp/opac/opac_details/?lang=0&amode=11&bibid=2003278170
- Sato, T. (2024). 最適化超入門. https://speakerdeck.com/tkm2261/zui-shi-hua-chao-ru-men
- Tsybakov, A. B. (2009). Introduction to Nonparametric Estimation. Springer.
- Vershynin, R. (2018). High-Dimensional Probability: An Introduction with Applications in Data Science. Cambridge University Press.
- Wainwright, M. J. (2019). High-Dimensional Statistics: A Non-Asymptotic Viewpoint (1st ed.). Cambridge University Press. https://www.cambridge.org/core/product/identifier/9781108627771/type/book
- 穴井 宏和, & 斉藤 努. (2015). 今日から使える!組合せ最適化 離散問題ガイドブック. 講談社.
- 穴井 宏和. (2013). 数理最適化の実践ガイド. 講談社.
- 竹村 彰通. (2020). 新装改訂版 現代数理統計学. 学術図書出版社.
機械学習-Bayesian methods-Gaussian processes (.bib)
- Rasmussen, C. E., & Williams, C. K. I. (2005). Gaussian Processes for Machine Learning. The MIT Press.
機械学習-Bayesian methods-Stochastic Gradient Langevin Dynamics (.bib)
- Welling, M., & Teh, Y. W. (2011). Bayesian learning via stochastic gradient Langevin dynamics. Proceedings of the 28th International Conference on International Conference on Machine Learning, 681–688.
機械学習-Bayesian methods-Variational inference (.bib)
- Gershman, S. J. Amortized Inference in Probabilistic Reasoning.
- Kingma, D. P., & Welling, M. (2019). An Introduction to Variational Autoencoders. Foundations and Trends® in Machine Learning, 12(4), 307–392. https://doi.org/10.1561/2200000056
機械学習-Bayesian methods-全般 (.bib)
- Edwards, W. (1982). Conservatism in human information processing. In A. Tversky, D. Kahneman, & P. Slovic (Eds.), Judgment under Uncertainty: Heuristics and Biases (pp. 359–369). Cambridge University Press. https://doi.org/10.1017/CBO9780511809477.026
- 繁桝 算男. (1985). ベイズ統計入門. 東京大学出版会.
機械学習-Bayesian methods (.bib)
- Edwards, W. (1982). Conservatism in human information processing. In A. Tversky, D. Kahneman, & P. Slovic (Eds.), Judgment under Uncertainty: Heuristics and Biases (pp. 359–369). Cambridge University Press. https://doi.org/10.1017/CBO9780511809477.026
- 繁桝 算男. (1985). ベイズ統計入門. 東京大学出版会.
- Gershman, S. J. Amortized Inference in Probabilistic Reasoning.
- Kingma, D. P., & Welling, M. (2019). An Introduction to Variational Autoencoders. Foundations and Trends® in Machine Learning, 12(4), 307–392. https://doi.org/10.1561/2200000056
- Rasmussen, C. E., & Williams, C. K. I. (2005). Gaussian Processes for Machine Learning. The MIT Press.
- Welling, M., & Teh, Y. W. (2011). Bayesian learning via stochastic gradient Langevin dynamics. Proceedings of the 28th International Conference on International Conference on Machine Learning, 681–688.
機械学習-Conformal prediction (.bib)
- Angelopoulos, A. N., & Bates, S. (2022). A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification. http://arxiv.org/abs/2107.07511
- Vovk, V., Gammerman, A., & Shafer, G. (2022). Algorithmic Learning in a Random World. Springer International Publishing. https://doi.org/10.1007/978-3-031-06649-8
機械学習-Hyperparameter tuning (.bib)
- Bergstra, J., & Bengio, Y. (2012). Random search for hyper-parameter optimization. Journal of Machine Learning Research, 13, 281–305.
機械学習-LLM-2024 (.bib)
機械学習-LLM-2025 (.bib)
- Gemini Team, Google. (2025). Gemini 2.5: pushing the frontier with advanced reasoning, multimodality, long context, and next generation agentic capabilities. https://storage.googleapis.com/deepmind-media/gemini/gemini_v2_5_report.pdf
機械学習-LLM-FineTuning (.bib)
- Dettmers, T., Pagnoni, A., Holtzman, A., & Zettlemoyer, L. (2023, November 2). QLoRA: Efficient Finetuning of Quantized LLMs. https://openreview.net/forum?id=OUIFPHEgJU
機械学習-LLM-LLM-as-a-judge (.bib)
- Thakur, A. S., Choudhary, K., Ramayapally, V. S., Vaidyanathan, S., & Hupkes, D. (2024). Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges. https://doi.org/10.48550/arXiv.2406.12624
- Zeng, Z., Yu, J., Gao, T., Meng, Y., Goyal, T., & Chen, D. (2023, October 13). Evaluating Large Language Models at Evaluating Instruction Following. https://openreview.net/forum?id=tr0KidwPLc
- Zheng, L., Chiang, W.-L., Sheng, Y., Zhuang, S., Wu, Z., Zhuang, Y., Lin, Z., Li, Z., Li, D., Xing, E., Zhang, H., Gonzalez, J. E., & Stoica, I. (2023, November 2). Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena. https://openreview.net/forum?id=uccHPGDlao
機械学習-LLM-MoE (.bib)
- Fedus, W., Zoph, B., & Shazeer, N. (2022). Switch transformers: scaling to trillion parameter models with simple and efficient sparsity. J. Mach. Learn. Res., 23(1), Article 120, 5232–5270.
機械学習-LLM-RLHF (.bib)
- Ustalov, D., & Lambert, N. (2023, July 24). Reinforcement Learning from Human Feedback: A Tutorial [Tutorial]. https://icml.cc/virtual/2023/tutorial/21554
機械学習-LLM-Training (.bib)
- The Llama 3 herd of models. (2024). https://doi.org/10.48550/arXiv.2407.21783
- Rafailov, R., Sharma, A., Mitchell, E., Manning, C. D., Ermon, S., & Finn, C. (2023, November 2). Direct preference optimization: your language model is secretly a reward model. https://openreview.net/forum?id=HPuSIXJaa9
- Kimi Team, Du, A., Gao, B., Xing, B., Jiang, C., Chen, C., Li, C., Xiao, C., Du, C., Liao, C., Tang, C., Wang, C., Zhang, D., Yuan, E., Lu, E., Tang, F., Sung, F., Wei, G., Lai, G., … Lin, Z. (2025). Kimi k1.5: Scaling Reinforcement Learning with LLMs. arXiv.org. https://arxiv.org/abs/2501.12599v4
機械学習-LLM-アラインメント (.bib)
- 1,000億パラメータの独自LLM「PLaMo-100B」の事後学習が完了. (2024). Preferred Networks Research & Development. https://tech.preferred.jp/ja/blog/plamo-100b-post-training/
- Askell, A., Bai, Y., Chen, A., Drain, D., Ganguli, D., Henighan, T., Jones, A., Joseph, N., Mann, B., DasSarma, N., Elhage, N., Hatfield-Dodds, Z., Hernandez, D., Kernion, J., Ndousse, K., Olsson, C., Amodei, D., Brown, T., Clark, J., … Kaplan, J. (2021). A general language assistant as a laboratory for alignment. https://doi.org/10.48550/arXiv.2112.00861
- Carlini, N., Ippolito, D., Jagielski, M., Lee, K., Tramer, F., & Zhang, C. (2022, September 29). Quantifying memorization across neural language models. https://openreview.net/forum?id=TatRHT_1cK
- Dong, H., Xiong, W., Pang, B., Wang, H., Zhao, H., Zhou, Y., Jiang, N., Sahoo, D., Xiong, C., & Zhang, T. (2024). RLHF Workflow: From Reward Modeling to Online RLHF. Transactions on Machine Learning Research. https://openreview.net/forum?id=a13aYUU9eU
- Emsley, R. (2023). ChatGPT: these are not hallucinations – they’re fabrications and falsifications. Schizophrenia, 9(1), 1–2. https://doi.org/10.1038/s41537-023-00379-4
- Guan, M. Y., Joglekar, M., Wallace, E., Jain, S., Barak, B., Helyar, A., Dias, R., Vallone, A., Ren, H., Wei, J., Chung, H. W., Toyer, S., Heidecke, J., Beutel, A., & Glaese, A. (2025). Deliberative alignment: reasoning enables safer language models. https://doi.org/10.48550/arXiv.2412.16339
- John Schulman. (2023). Reinforcement Learning from Human Feedback: Progress and Challenges. https://www.youtube.com/watch?v=hhiLw5Q_UFg
- Kirk, H. R., Jun, Y., Iqbal, H., Benussi, E., Volpin, F., Dreyer, F. A., Shtedritski, A., & Asano, Y. M. (2024). Bias out-of-the-box: an empirical analysis of intersectional occupational biases in popular generative language models. Proceedings of the 35th International Conference on Neural Information Processing Systems, 2611–2624.
- Li, T., Khashabi, D., Khot, T., Sabharwal, A., & Srikumar, V. (2020). UNQOVERing stereotyping biases via underspecified questions. In T. Cohn, Y. He, & Y. Liu (Eds.), Findings of the Association for Computational Linguistics: EMNLP 2020 (pp. 3475–3489). Association for Computational Linguistics. https://doi.org/10.18653/v1/2020.findings-emnlp.311
- OpenAI. (2023). GPT-4 technical report. https://doi.org/10.48550/arXiv.2303.08774
- Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C., Mishkin, P., Zhang, C., Agarwal, S., Slama, K., Ray, A., & others. (2022). Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems, 35, 27730–27744.
- Rafailov, R., Sharma, A., Mitchell, E., Manning, C. D., Ermon, S., & Finn, C. (2023, November 2). Direct preference optimization: your language model is secretly a reward model. https://openreview.net/forum?id=HPuSIXJaa9
- Santurkar, S., Durmus, E., Ladhak, F., Lee, C., Liang, P., & Hashimoto, T. (2023). Whose opinions do language models reflect? Proceedings of the 40th International Conference on Machine Learning, 202, 29971–30004.
- Schulman, J., Wolski, F., Dhariwal, P., Radford, A., & Klimov, O. (2017). Proximal Policy Optimization Algorithms. https://doi.org/10.48550/arXiv.1707.06347
- Stiennon, N., Ouyang, L., Wu, J., Ziegler, D. M., Lowe, R., Voss, C., Radford, A., Amodei, D., & Christiano, P. (2020). Learning to summarize from human feedback. Proceedings of the 34th International Conference on Neural Information Processing Systems, 3008–3021.
- Tal, Y., Magar, I., & Schwartz, R. (2022). Fewer errors, but more stereotypes? The effect of model size on gender bias. In C. Hardmeier, C. Basta, M. R. Costa-jussà, G. Stanovsky, & H. Gonen (Eds.), Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP) (pp. 112–120). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.gebnlp-1.13
- 篠田 一聡. (2024, August 25). 論文紹介:Direct Preference Optimization: Your Language Model Is Secretly a Reward Model. https://speakerdeck.com/kazutoshishinoda/lun-wen-shao-jie-direct-preference-optimization-your-language-model-is-secretly-a-reward-model
- Zhou, C., Liu, P., Xu, P., Iyer, S., Sun, J., Mao, Y., Ma, X., Efrat, A., Yu, P., Yu, L., Zhang, S., Ghosh, G., Lewis, M., Zettlemoyer, L., & Levy, O. (2023, November 2). LIMA: less is more for alignment. https://openreview.net/forum?id=KBMOKmX2he
機械学習-LLM-エージェント (.bib)
- 7群7編 分散協調とエージェント – 電子情報通信学会知識ベース. Retrieved July 5, 2025, from https://www.ieice-hbkb.org/portal/7%e7%be%a4%e3%80%80%e3%82%b3%e3%83%b3%e3%83%94%e3%83%a5%e3%83%bc%e3%82%bf-%ef%bd%bf%ef%be%8c%ef%be%84%ef%bd%b3%ef%bd%aa%ef%bd%b1/7%e7%be%a47%e7%b7%a8-%e5%88%86%e6%95%a3%e5%8d%94%e8%aa%bf%e3%81%a8%e3%82%a8%e3%83%bc%e3%82%b8%e3%82%a7%e3%83%b3%e3%83%88/
- 電子情報通信学会. (2019). 7群-7編-1章 エージェントの定義・モデル・概念. In 電子情報通信学会知識ベース (ver.1 ed., Number 7群7編). https://www.ieice-hbkb.org/files/ad_base/view_pdf.html?p=/files/07/07gun_07hen_01.pdf
- A generalist agent. (2022). Transactions on Machine Learning Research. https://openreview.net/forum?id=1ikK0kHjvj
- Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K. R., & Cao, Y. (2022, September 29). ReAct: synergizing reasoning and acting in language models. https://openreview.net/forum?id=WE_vluYUL-X
機械学習-LLM-スケーリング則 (.bib)
- Henighan, T., Kaplan, J., Katz, M., Chen, M., Hesse, C., Jackson, J., Jun, H., Brown, T. B., Dhariwal, P., Gray, S., Hallacy, C., Mann, B., Radford, A., Ramesh, A., Ryder, N., Ziegler, D. M., Schulman, J., Amodei, D., & McCandlish, S. (2020). Scaling laws for autoregressive generative modeling. https://doi.org/10.48550/arXiv.2010.14701
- Hernandez, D., Kaplan, J., Henighan, T., & McCandlish, S. (2021). Scaling laws for transfer. https://doi.org/10.48550/arXiv.2102.01293
- Training compute-optimal large language models. (2022). http://arxiv.org/abs/2203.15556
- Kaplan, J., McCandlish, S., Henighan, T., Brown, T. B., Chess, B., Child, R., Gray, S., Radford, A., Wu, J., & Amodei, D. (2020). Scaling laws for neural language models. https://doi.org/10.48550/arXiv.2001.08361
- Li, C. (2020). OpenAI’s GPT-3 language model: a technical overview. https://lambdalabs.com/blog/demystifying-gpt-3
- Muennighoff, N., Rush, A. M., Barak, B., Scao, T. L., Tazi, N., Piktus, A., Pyysalo, S., Wolf, T., & Raffel, C. (2023, November 2). Scaling data-constrained language models. https://openreview.net/forum?id=j5BuTrEj35
- Schaeffer, R., Miranda, B., & Koyejo, S. (2023, November 2). Are emergent abilities of large language models a mirage? https://openreview.net/forum?id=ITw9edRDlD
- Wei, J., Tay, Y., Bommasani, R., Raffel, C., Zoph, B., Borgeaud, S., Yogatama, D., Bosma, M., Zhou, D., Metzler, D., Chi, E. H., Hashimoto, T., Vinyals, O., Liang, P., Dean, J., & Fedus, W. (2022). Emergent abilities of large language models. Transactions on Machine Learning Research. https://openreview.net/forum?id=yzkSU5zdwD
- Yang, G., & Hu, E. J. (2022). Feature Learning in Infinite-Width Neural Networks. https://doi.org/10.48550/arXiv.2011.14522
- Yang, G., Simon, J. B., & Bernstein, J. (2024). A Spectral Condition for Feature Learning. https://doi.org/10.48550/arXiv.2310.17813
- Yang, G., Hu, E. J., Babuschkin, I., Sidor, S., Farhi, D., Pachocki, J., Liu, X., Chen, W., & Gao, J. (2022, March 7). Tensor programs V: tuning large neural networks via zero-shot hyperparameter transfer. https://www.microsoft.com/en-us/research/publication/tuning-large-neural-networks-via-zero-shot-hyperparameter-transfer/
機械学習-LLM-プロンプティング (.bib)
- Kojima, T., Gu, S. S., Reid, M., Matsuo, Y., & Iwasawa, Y. (2022). Large language models are zero-shot reasoners. Proceedings of the 36th International Conference on Neural Information Processing Systems, 22199–22213. https://doi.org/10.5555/3600270.3601883
- Razeghi, Y., Logan IV, R. L., Gardner, M., & Singh, S. (2022). Impact of pretraining term frequencies on few-shot numerical reasoning. In Y. Goldberg, Z. Kozareva, & Y. Zhang (Eds.), Findings of the Association for Computational Linguistics: EMNLP 2022 (pp. 840–854). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.findings-emnlp.59
- Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B., Xia, F., Chi, E. H., Le, Q. V., & Zhou, D. (2024). Chain-of-thought prompting elicits reasoning in large language models. Proceedings of the 36th International Conference on Neural Information Processing Systems, 24824–24837.
機械学習-LLM-事後学習-RL (.bib)
- Bai, Y., Kadavath, S., Kundu, S., Askell, A., Kernion, J., Jones, A., Chen, A., Goldie, A., Mirhoseini, A., McKinnon, C., Chen, C., Olsson, C., Olah, C., Hernandez, D., Drain, D., Ganguli, D., Li, D., Tran-Johnson, E., Perez, E., … Kaplan, J. (2022). Constitutional AI: harmlessness from AI feedback. https://doi.org/10.48550/arXiv.2212.08073
- Khan, F. (2025). FareedKhan-dev/train-deepseek-r1. https://github.com/FareedKhan-dev/train-deepseek-r1
- Ustalov, D., & Lambert, N. (2023, July 24). Reinforcement Learning from Human Feedback: A Tutorial [Tutorial]. https://icml.cc/virtual/2023/tutorial/21554
- Wang, Y., Yang, Q., Zeng, Z., Ren, L., Liu, L., Peng, B., Cheng, H., He, X., Wang, K., Gao, J., Chen, W., Wang, S., Du, S. S., & Shen, Y. (2025). Reinforcement learning for reasoning in large language models with one training example. https://doi.org/10.48550/arXiv.2504.20571
- Wen, X., Liu, Z., Zheng, S., Xu, Z., Ye, S., Wu, Z., Liang, X., Wang, Y., Li, J., Miao, Z., Bian, J., & Yang, M. (2025). Reinforcement learning with verifiable rewards implicitly incentivizes correct reasoning in base LLMs. https://doi.org/10.48550/arXiv.2506.14245
- Zhao, X., Kang, Z., Feng, A., Levine, S., & Song, D. (2025). Learning to reason without external rewards. https://doi.org/10.48550/arXiv.2505.19590
機械学習-LLM-事後学習-SFT (.bib)
機械学習-LLM-事後学習-全般 (.bib)
- Lambert, N., Morrison, J., Pyatkin, V., Huang, S., Ivison, H., Brahman, F., Miranda, L. J. V., Liu, A., Dziri, N., Lyu, S., Gu, Y., Malik, S., Graf, V., Hwang, J. D., Yang, J., Bras, R. L., Tafjord, O., Wilhelm, C., Soldaini, L., … Hajishirzi, H. (2025). Tulu 3: Pushing Frontiers in Open Language Model Post-Training. https://doi.org/10.48550/arXiv.2411.15124
機械学習-LLM-事後学習-選好チューニング (.bib)
- The Llama 3 herd of models. (2024). https://doi.org/10.48550/arXiv.2407.21783
- Rafailov, R., Sharma, A., Mitchell, E., Manning, C. D., Ermon, S., & Finn, C. (2023, November 2). Direct preference optimization: your language model is secretly a reward model. https://openreview.net/forum?id=HPuSIXJaa9
機械学習-LLM-全般 (.bib)
- Azevedo, F. A. C., Carvalho, L. R. B., Grinberg, L. T., Farfel, J. M., Ferretti, R. E. L., Leite, R. E. P., Filho, W. J., Lent, R., & Herculano-Houzel, S. (2009). Equal numbers of neuronal and nonneuronal cells make the human brain an isometrically scaled-up primate brain. Journal of Comparative Neurology, 513(5), 532–541. https://doi.org/10.1002/cne.21974
- An empirical analysis of compute-optimal large language model training. (2022, October 31). https://openreview.net/forum?id=iBBcRUlOAPR
- Holtzman, A., Buys, J., Du, L., Forbes, M., & Choi, Y. (2019, September 25). The curious case of neural text degeneration. https://openreview.net/forum?id=rygGQyrFvH
- 山田 育矢, 鈴木 正敏, 山田 康輔, & 李 凌寒. (2023). 大規模言語モデル入門. 技術評論社.
- 山田 育矢, 鈴木 正敏, 西川 荘介, 藤井 一喜, 山田 康輔, & 李 凌寒. (2024). 大規模言語モデル入門Ⅱ〜生成型LLMの実装と評価. 技術評論社.
機械学習-LLM-思考モデル (.bib)
- DeepSeek-AI, Guo, D., Yang, D., Zhang, H., Song, J., Zhang, R., Xu, R., Zhu, Q., Ma, S., Wang, P., Bi, X., Zhang, X., Yu, X., Wu, Y., Wu, Z. F., Gou, Z., Shao, Z., Li, Z., Gao, Z., … Zhang, Z. (2025). DeepSeek-R1: incentivizing reasoning capability in LLMs via reinforcement learning. https://doi.org/10.48550/arXiv.2501.12948
- Guan, M. Y., Joglekar, M., Wallace, E., Jain, S., Barak, B., Helyar, A., Dias, R., Vallone, A., Ren, H., Wei, J., Chung, H. W., Toyer, S., Heidecke, J., Beutel, A., & Glaese, A. (2025). Deliberative alignment: reasoning enables safer language models. https://doi.org/10.48550/arXiv.2412.16339
- Jin, M., Yu, Q., Shu, D., Zhao, H., Hua, W., Meng, Y., Zhang, Y., & Du, M. (2024). The impact of reasoning step length on large language models (L.-W. Ku, A. Martins, & V. Srikumar, Eds.; pp. 1830–1842). Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.findings-acl.108
- Lee, K.-H., Fischer, I., Wu, Y.-H., Marwood, D., Baluja, S., Schuurmans, D., & Chen, X. (2025). Evolving deeper LLM thinking. https://doi.org/10.48550/arXiv.2501.09891
- Liu, R., Gao, J., Zhao, J., Zhang, K., Li, X., Qi, B., Ouyang, W., & Zhou, B. (2025). Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling. https://doi.org/10.48550/arXiv.2502.06703
- Muennighoff, N., Yang, Z., Shi, W., Li, X. L., Fei-Fei, L., Hajishirzi, H., Zettlemoyer, L., Liang, P., Candès, E., & Hashimoto, T. (2025). s1: Simple test-time scaling. https://doi.org/10.48550/arXiv.2501.19393
- OpenAI. (2024). Learning to reason with LLMs. https://openai.com/index/learning-to-reason-with-llms/
- Rafailov, R., Hejna, J., Park, R., & Finn, C. (2024, August 26). From r to Q*: Your Language Model is Secretly a Q-Function. https://openreview.net/forum?id=kEVcNxtqXk
- Sardana, N., Portes, J., Doubov, S., & Frankle, J. (2024). Beyond Chinchilla-optimal: accounting for inference in language model scaling laws. Proceedings of the 41st International Conference on Machine Learning, 235, 43445–43460.
- Shinn, N., Cassano, F., Gopinath, A., Narasimhan, K. R., & Yao, S. (2023, November 2). Reflexion: language agents with verbal reinforcement learning. https://openreview.net/forum?id=vAElhFcKW6
- Singhal, S., Zeng, J., Bukharin, A., Zhang, Y., Shen, G., Mahabaleshwarkar, A. S., Kartal, B., Suhara, Y., Bercovich, A., Levy, I., Golan, I., Dabbah, M., El-Yaniv, R., Majumdar, S., Gitman, I., Bakhturina, E., Zhang, J. J., Su, B.-Y., Huang, G., … Konuk, T. (2025, June 21). Llama-Nemotron: Efficient Reasoning Models. https://openreview.net/forum?id=ev1xpo9mbI
- Snell, C. V., Lee, J., Xu, K., & Kumar, A. (2024, October 4). Scaling LLM test-time compute optimally can be more effective than scaling parameters for reasoning. https://openreview.net/forum?id=4FWAwZtd2n
- Wang, X., & Zhou, D. (2024, November 6). Chain-of-Thought Reasoning Without Prompting. https://openreview.net/forum?id=4Zt7S0B0Jp
- Wu, T., Lan, J., Yuan, W., Jiao, J., Weston, J. E., & Sukhbaatar, S. (2025, June 18). Thinking LLMs: general instruction following with thought generation. https://openreview.net/forum?id=z6SrgYCdey&noteId=t3y0Ev0lm6
- Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K. R., & Cao, Y. (2022, September 29). ReAct: synergizing reasoning and acting in language models. https://openreview.net/forum?id=WE_vluYUL-X
- Yao, S., Yu, D., Zhao, J., Shafran, I., Griffiths, T. L., Cao, Y., & Narasimhan, K. R. (2023, November 2). Tree of Thoughts: deliberate problem solving with large language models. https://openreview.net/forum?id=5Xc1ecxO1h
- Zelikman, E., Harik, G. R., Shao, Y., Jayasiri, V., Haber, N., & Goodman, N. (2024, August 26). Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking. https://openreview.net/forum?id=oRXPiSOGH9
- Zelikman, E., Wu, Y., Mu, J., & Goodman, N. (2022, October 31). STaR: bootstrapping reasoning with reasoning. https://openreview.net/forum?id=_3ELRdg2sgI
機械学習-LLM-推論スケーリング (.bib)
- Yue, Z., Zhuang, H., Bai, A., Hui, K., Jagerman, R., Zeng, H., Qin, Z., Wang, D., Wang, X., & Bendersky, M. (2024, October 4). Inference Scaling for Long-Context Retrieval Augmented Generation. https://openreview.net/forum?id=FSjIrOm1vz
機械学習-LLM-文脈内学習 (.bib)
- Min, S., Lewis, M., Hajishirzi, H., & Zettlemoyer, L. (2022). Noisy channel language model prompting for few-shot text classification. In S. Muresan, P. Nakov, & A. Villavicencio (Eds.), Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 5316–5330). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.acl-long.365
- Min, S., Lyu, X., Holtzman, A., Artetxe, M., Lewis, M., Hajishirzi, H., & Zettlemoyer, L. (2022). Rethinking the role of demonstrations: what makes in-context learning work? In Y. Goldberg, Z. Kozareva, & Y. Zhang (Eds.), Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (pp. 11048–11064). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.emnlp-main.759
- Transformers learn in-context by gradient descent. (2022). https://arxiv.org/abs/2212.07677v2
- Wei, J., Wei, J., Tay, Y., Tran, D., Webson, A., Lu, Y., Chen, X., Liu, H., Huang, D., Zhou, D., & Ma, T. (2023). Larger language models do in-context learning differently. https://openreview.net/forum?id=DRGnEkbiQZ
機械学習-LLM-評価 (.bib)
- Wang, Y., Li, H., Han, X., Nakov, P., & Baldwin, T. (2023). Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs. https://doi.org/10.48550/arXiv.2308.13387
機械学習-LLM (.bib)
- 1,000億パラメータの独自LLM「PLaMo-100B」の事後学習が完了. (2024). Preferred Networks Research & Development. https://tech.preferred.jp/ja/blog/plamo-100b-post-training/
- 7群7編 分散協調とエージェント – 電子情報通信学会知識ベース. Retrieved July 5, 2025, from https://www.ieice-hbkb.org/portal/7%e7%be%a4%e3%80%80%e3%82%b3%e3%83%b3%e3%83%94%e3%83%a5%e3%83%bc%e3%82%bf-%ef%bd%bf%ef%be%8c%ef%be%84%ef%bd%b3%ef%bd%aa%ef%bd%b1/7%e7%be%a47%e7%b7%a8-%e5%88%86%e6%95%a3%e5%8d%94%e8%aa%bf%e3%81%a8%e3%82%a8%e3%83%bc%e3%82%b8%e3%82%a7%e3%83%b3%e3%83%88/
- Askell, A., Bai, Y., Chen, A., Drain, D., Ganguli, D., Henighan, T., Jones, A., Joseph, N., Mann, B., DasSarma, N., Elhage, N., Hatfield-Dodds, Z., Hernandez, D., Kernion, J., Ndousse, K., Olsson, C., Amodei, D., Brown, T., Clark, J., … Kaplan, J. (2021). A general language assistant as a laboratory for alignment. https://doi.org/10.48550/arXiv.2112.00861
- Azevedo, F. A. C., Carvalho, L. R. B., Grinberg, L. T., Farfel, J. M., Ferretti, R. E. L., Leite, R. E. P., Filho, W. J., Lent, R., & Herculano-Houzel, S. (2009). Equal numbers of neuronal and nonneuronal cells make the human brain an isometrically scaled-up primate brain. Journal of Comparative Neurology, 513(5), 532–541. https://doi.org/10.1002/cne.21974
- Bai, Y., Kadavath, S., Kundu, S., Askell, A., Kernion, J., Jones, A., Chen, A., Goldie, A., Mirhoseini, A., McKinnon, C., Chen, C., Olsson, C., Olah, C., Hernandez, D., Drain, D., Ganguli, D., Li, D., Tran-Johnson, E., Perez, E., … Kaplan, J. (2022). Constitutional AI: harmlessness from AI feedback. https://doi.org/10.48550/arXiv.2212.08073
- Bubeck, S., Chandrasekaran, V., Eldan, R., Gehrke, J., Horvitz, E., Kamar, E., Lee, P., Lee, Y. T., Li, Y., Lundberg, S., Nori, H., Palangi, H., Ribeiro, M. T., & Zhang, Y. (2023). Sparks of Artificial General Intelligence: Early experiments with GPT-4. https://doi.org/10.48550/arXiv.2303.12712
- Carlini, N., Ippolito, D., Jagielski, M., Lee, K., Tramer, F., & Zhang, C. (2022, September 29). Quantifying memorization across neural language models. https://openreview.net/forum?id=TatRHT_1cK
- DeepSeek-AI, Guo, D., Yang, D., Zhang, H., Song, J., Zhang, R., Xu, R., Zhu, Q., Ma, S., Wang, P., Bi, X., Zhang, X., Yu, X., Wu, Y., Wu, Z. F., Gou, Z., Shao, Z., Li, Z., Gao, Z., … Zhang, Z. (2025). DeepSeek-R1: incentivizing reasoning capability in LLMs via reinforcement learning. https://doi.org/10.48550/arXiv.2501.12948
- Dettmers, T., Pagnoni, A., Holtzman, A., & Zettlemoyer, L. (2023, November 2). QLoRA: Efficient Finetuning of Quantized LLMs. https://openreview.net/forum?id=OUIFPHEgJU
- 電子情報通信学会. (2019). 7群-7編-1章 エージェントの定義・モデル・概念. In 電子情報通信学会知識ベース (ver.1 ed., Number 7群7編). https://www.ieice-hbkb.org/files/ad_base/view_pdf.html?p=/files/07/07gun_07hen_01.pdf
- Dong, H., Xiong, W., Pang, B., Wang, H., Zhao, H., Zhou, Y., Jiang, N., Sahoo, D., Xiong, C., & Zhang, T. (2024). RLHF Workflow: From Reward Modeling to Online RLHF. Transactions on Machine Learning Research. https://openreview.net/forum?id=a13aYUU9eU
- The Llama 3 herd of models. (2024). https://doi.org/10.48550/arXiv.2407.21783
- Eloundou, T., Manning, S., Mishkin, P., & Rock, D. (2023). GPTs are GPTs: an early look at the labor market impact potential of large language models. https://doi.org/10.48550/arXiv.2303.10130
- Eloundou, T., Manning, S., Mishkin, P., & Rock, D. (2024). GPTs are GPTs: Labor market impact potential of LLMs. Science, 384(6702), 1306–1308. https://doi.org/10.1126/science.adj0998
- Emsley, R. (2023). ChatGPT: these are not hallucinations – they’re fabrications and falsifications. Schizophrenia, 9(1), 1–2. https://doi.org/10.1038/s41537-023-00379-4
- Fedus, W., Zoph, B., & Shazeer, N. (2022). Switch transformers: scaling to trillion parameter models with simple and efficient sparsity. J. Mach. Learn. Res., 23(1), Article 120, 5232–5270.
- Gemini Team, Google. (2025). Gemini 2.5: pushing the frontier with advanced reasoning, multimodality, long context, and next generation agentic capabilities. https://storage.googleapis.com/deepmind-media/gemini/gemini_v2_5_report.pdf
- Guan, M. Y., Joglekar, M., Wallace, E., Jain, S., Barak, B., Helyar, A., Dias, R., Vallone, A., Ren, H., Wei, J., Chung, H. W., Toyer, S., Heidecke, J., Beutel, A., & Glaese, A. (2025). Deliberative alignment: reasoning enables safer language models. https://doi.org/10.48550/arXiv.2412.16339
- Hayou, S., Ghosh, N., & Yu, B. (2024). LoRA+: Efficient Low Rank Adaptation of Large Models. Proceedings of the 41st International Conference on Machine Learning, 17783–17806. https://proceedings.mlr.press/v235/hayou24a.html
- 横井 祥. (2024). 「確率的なオウム」にできること、またそれがなぜできるのかについて. https://speakerdeck.com/eumesy/language-models-as-modern-version-of-the-use-theory-of-meaning
- Henighan, T., Kaplan, J., Katz, M., Chen, M., Hesse, C., Jackson, J., Jun, H., Brown, T. B., Dhariwal, P., Gray, S., Hallacy, C., Mann, B., Radford, A., Ramesh, A., Ryder, N., Ziegler, D. M., Schulman, J., Amodei, D., & McCandlish, S. (2020). Scaling laws for autoregressive generative modeling. https://doi.org/10.48550/arXiv.2010.14701
- Hernandez, D., Kaplan, J., Henighan, T., & McCandlish, S. (2021). Scaling laws for transfer. https://doi.org/10.48550/arXiv.2102.01293
- An empirical analysis of compute-optimal large language model training. (2022, October 31). https://openreview.net/forum?id=iBBcRUlOAPR
- Training compute-optimal large language models. (2022). http://arxiv.org/abs/2203.15556
- Holtzman, A., Buys, J., Du, L., Forbes, M., & Choi, Y. (2019, September 25). The curious case of neural text degeneration. https://openreview.net/forum?id=rygGQyrFvH
- Hu, E. J., Shen, Y., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, S., Wang, L., & Chen, W. (2022). LoRA: Low-rank adaptation of large language models. International Conference on Learning Representations. https://openreview.net/forum?id=nZeVKeeFYf9
- Jin, M., Yu, Q., Shu, D., Zhao, H., Hua, W., Meng, Y., Zhang, Y., & Du, M. (2024). The impact of reasoning step length on large language models (L.-W. Ku, A. Martins, & V. Srikumar, Eds.; pp. 1830–1842). Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.findings-acl.108
- John Schulman. (2023). Reinforcement Learning from Human Feedback: Progress and Challenges. https://www.youtube.com/watch?v=hhiLw5Q_UFg
- Kaplan, J., McCandlish, S., Henighan, T., Brown, T. B., Chess, B., Child, R., Gray, S., Radford, A., Wu, J., & Amodei, D. (2020). Scaling laws for neural language models. https://doi.org/10.48550/arXiv.2001.08361
- Khan, F. (2025). FareedKhan-dev/train-deepseek-r1. https://github.com/FareedKhan-dev/train-deepseek-r1
- Kirk, H. R., Jun, Y., Iqbal, H., Benussi, E., Volpin, F., Dreyer, F. A., Shtedritski, A., & Asano, Y. M. (2024). Bias out-of-the-box: an empirical analysis of intersectional occupational biases in popular generative language models. Proceedings of the 35th International Conference on Neural Information Processing Systems, 2611–2624.
- Kojima, T., Gu, S. S., Reid, M., Matsuo, Y., & Iwasawa, Y. (2022). Large language models are zero-shot reasoners. Proceedings of the 36th International Conference on Neural Information Processing Systems, 22199–22213. https://doi.org/10.5555/3600270.3601883
- Lambert, N., Morrison, J., Pyatkin, V., Huang, S., Ivison, H., Brahman, F., Miranda, L. J. V., Liu, A., Dziri, N., Lyu, S., Gu, Y., Malik, S., Graf, V., Hwang, J. D., Yang, J., Bras, R. L., Tafjord, O., Wilhelm, C., Soldaini, L., … Hajishirzi, H. (2025). Tulu 3: Pushing Frontiers in Open Language Model Post-Training. https://doi.org/10.48550/arXiv.2411.15124
- Learn Prompting: Your Guide to Communicating with AI. (2023). https://learnprompting.org/
- Lee, K.-H., Fischer, I., Wu, Y.-H., Marwood, D., Baluja, S., Schuurmans, D., & Chen, X. (2025). Evolving deeper LLM thinking. https://doi.org/10.48550/arXiv.2501.09891
- Li, C. (2020). OpenAI’s GPT-3 language model: a technical overview. https://lambdalabs.com/blog/demystifying-gpt-3
- Liu, R., Gao, J., Zhao, J., Zhang, K., Li, X., Qi, B., Ouyang, W., & Zhou, B. (2025). Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling. https://doi.org/10.48550/arXiv.2502.06703
- Li, T., Khashabi, D., Khot, T., Sabharwal, A., & Srikumar, V. (2020). UNQOVERing stereotyping biases via underspecified questions. In T. Cohn, Y. He, & Y. Liu (Eds.), Findings of the Association for Computational Linguistics: EMNLP 2020 (pp. 3475–3489). Association for Computational Linguistics. https://doi.org/10.18653/v1/2020.findings-emnlp.311
- Min, S., Lewis, M., Hajishirzi, H., & Zettlemoyer, L. (2022). Noisy channel language model prompting for few-shot text classification. In S. Muresan, P. Nakov, & A. Villavicencio (Eds.), Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 5316–5330). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.acl-long.365
- Min, S., Lyu, X., Holtzman, A., Artetxe, M., Lewis, M., Hajishirzi, H., & Zettlemoyer, L. (2022). Rethinking the role of demonstrations: what makes in-context learning work? In Y. Goldberg, Z. Kozareva, & Y. Zhang (Eds.), Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (pp. 11048–11064). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.emnlp-main.759
- Muennighoff, N., Yang, Z., Shi, W., Li, X. L., Fei-Fei, L., Hajishirzi, H., Zettlemoyer, L., Liang, P., Candès, E., & Hashimoto, T. (2025). s1: Simple test-time scaling. https://doi.org/10.48550/arXiv.2501.19393
- Muennighoff, N., Rush, A. M., Barak, B., Scao, T. L., Tazi, N., Piktus, A., Pyysalo, S., Wolf, T., & Raffel, C. (2023, November 2). Scaling data-constrained language models. https://openreview.net/forum?id=j5BuTrEj35
- OpenAI Platform. Retrieved August 29, 2024, from https://platform.openai.com
- OpenAI. (2023). GPT-4 technical report. https://doi.org/10.48550/arXiv.2303.08774
- OpenAI. (2024). Learning to reason with LLMs. https://openai.com/index/learning-to-reason-with-llms/
- Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C., Mishkin, P., Zhang, C., Agarwal, S., Slama, K., Ray, A., & others. (2022). Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems, 35, 27730–27744.
- Prompt Engineering Guide – Nextra. Retrieved August 29, 2024, from https://www.promptingguide.ai/
- Rafailov, R., Hejna, J., Park, R., & Finn, C. (2024, August 26). From r to Q*: your language model is secretly a Q-function. https://openreview.net/forum?id=kEVcNxtqXk
- Rafailov, R., Sharma, A., Mitchell, E., Manning, C. D., Ermon, S., & Finn, C. (2023, November 2). Direct preference optimization: your language model is secretly a reward model. https://openreview.net/forum?id=HPuSIXJaa9
- Razeghi, Y., Logan IV, R. L., Gardner, M., & Singh, S. (2022). Impact of pretraining term frequencies on few-shot numerical reasoning. In Y. Goldberg, Z. Kozareva, & Y. Zhang (Eds.), Findings of the Association for Computational Linguistics: EMNLP 2022 (pp. 840–854). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.findings-emnlp.59
- A generalist agent. (2022). Transactions on Machine Learning Research. https://openreview.net/forum?id=1ikK0kHjvj
- Santurkar, S., Durmus, E., Ladhak, F., Lee, C., Liang, P., & Hashimoto, T. (2023). Whose opinions do language models reflect? Proceedings of the 40th International Conference on Machine Learning, 202, 29971–30004.
- Sardana, N., Portes, J., Doubov, S., & Frankle, J. (2024). Beyond Chinchilla-optimal: accounting for inference in language model scaling laws. Proceedings of the 41st International Conference on Machine Learning, 235, 43445–43460.
- Schaeffer, R., Miranda, B., & Koyejo, S. (2023, November 2). Are emergent abilities of large language models a mirage? https://openreview.net/forum?id=ITw9edRDlD
- Schulhoff, S., Ilie, M., Balepur, N., Kahadze, K., Liu, A., Si, C., Li, Y., Gupta, A., Han, H. J., Schulhoff, S., Dulepet, P. S., Vidyadhara, S., Ki, D., Agrawal, S., Pham, C., Kroiz, G., Li, F., Tao, H., Srivastava, A., … Resnik, P. (2024). The Prompt Report: A Systematic Survey of Prompting Techniques. http://arxiv.org/abs/2406.06608
- Schulman, J., Wolski, F., Dhariwal, P., Radford, A., & Klimov, O. (2017). Proximal Policy Optimization Algorithms. https://doi.org/10.48550/arXiv.1707.06347
- 山田 育矢, 鈴木 正敏, 山田 康輔, & 李 凌寒. (2023). 大規模言語モデル入門. 技術評論社.
- 山田 育矢, 鈴木 正敏, 西川 荘介, 藤井 一喜, 山田 康輔, & 李 凌寒. (2024). 大規模言語モデル入門Ⅱ〜生成型LLMの実装と評価. 技術評論社.
- Shinn, N., Cassano, F., Gopinath, A., Narasimhan, K. R., & Yao, S. (2023, November 2). Reflexion: language agents with verbal reinforcement learning. https://openreview.net/forum?id=vAElhFcKW6
- Singhal, S., Zeng, J., Bukharin, A., Zhang, Y., Shen, G., Mahabaleshwarkar, A. S., Kartal, B., Suhara, Y., Bercovich, A., Levy, I., Golan, I., Dabbah, M., El-Yaniv, R., Majumdar, S., Gitman, I., Bakhturina, E., Zhang, J. J., Su, B.-Y., Huang, G., … Konuk, T. (2025, June 21). Llama-Nemotron: efficient reasoning models. https://openreview.net/forum?id=ev1xpo9mbI
- Snell, C. V., Lee, J., Xu, K., & Kumar, A. (2024, October 4). Scaling LLM test-time compute optimally can be more effective than scaling parameters for reasoning. https://openreview.net/forum?id=4FWAwZtd2n
- Stiennon, N., Ouyang, L., Wu, J., Ziegler, D. M., Lowe, R., Voss, C., Radford, A., Amodei, D., & Christiano, P. (2020). Learning to summarize from human feedback. Proceedings of the 34th International Conference on Neural Information Processing Systems, 3008–3021.
- Tal, Y., Magar, I., & Schwartz, R. (2022). Fewer errors, but more stereotypes? The effect of model size on gender bias. In C. Hardmeier, C. Basta, M. R. Costa-jussà, G. Stanovsky, & H. Gonen (Eds.), Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP) (pp. 112–120). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.gebnlp-1.13
- Thakur, A. S., Choudhary, K., Ramayapally, V. S., Vaidyanathan, S., & Hupkes, D. (2024). Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges. https://doi.org/10.48550/arXiv.2406.12624
- Ustalov, D., & Lambert, N. (2023, July 24). Reinforcement learning from human feedback: a tutorial [Tutorial]. ICML 2023. https://icml.cc/virtual/2023/tutorial/21554
- Transformers learn in-context by gradient descent. (2022). https://arxiv.org/abs/2212.07677v2
- Wang, X., & Zhou, D. (2024, November 6). Chain-of-Thought Reasoning Without Prompting. https://openreview.net/forum?id=4Zt7S0B0Jp
- Wang, Y., Li, H., Han, X., Nakov, P., & Baldwin, T. (2023). Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs. https://doi.org/10.48550/arXiv.2308.13387
- Wang, Y., Yang, Q., Zeng, Z., Ren, L., Liu, L., Peng, B., Cheng, H., He, X., Wang, K., Gao, J., Chen, W., Wang, S., Du, S. S., & Shen, Y. (2025). Reinforcement learning for reasoning in large language models with one training example. https://doi.org/10.48550/arXiv.2504.20571
- Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B., Xia, F., Chi, E. H., Le, Q. V., & Zhou, D. (2024). Chain-of-thought prompting elicits reasoning in large language models. Proceedings of the 36th International Conference on Neural Information Processing Systems, 24824–24837.
- Wei, J., Tay, Y., Bommasani, R., Raffel, C., Zoph, B., Borgeaud, S., Yogatama, D., Bosma, M., Zhou, D., Metzler, D., Chi, E. H., Hashimoto, T., Vinyals, O., Liang, P., Dean, J., & Fedus, W. (2022). Emergent abilities of large language models. Transactions on Machine Learning Research. https://openreview.net/forum?id=yzkSU5zdwD
- Wei, J., Wei, J., Tay, Y., Tran, D., Webson, A., Lu, Y., Chen, X., Liu, H., Huang, D., Zhou, D., & Ma, T. (2023). Larger language models do in-context learning differently. https://openreview.net/forum?id=DRGnEkbiQZ
- Wen, X., Liu, Z., Zheng, S., Xu, Z., Ye, S., Wu, Z., Liang, X., Wang, Y., Li, J., Miao, Z., Bian, J., & Yang, M. (2025). Reinforcement learning with verifiable rewards implicitly incentivizes correct reasoning in base LLMs. https://doi.org/10.48550/arXiv.2506.14245
- Wu, T., Lan, J., Yuan, W., Jiao, J., Weston, J. E., & Sukhbaatar, S. (2025, June 18). Thinking LLMs: general instruction following with thought generation. https://openreview.net/forum?id=z6SrgYCdey&noteId=t3y0Ev0lm6
- 篠田 一聡. (2024年8月25日). 論文紹介:Direct Preference Optimization: Your Language Model Is Secretly a Reward Model. https://speakerdeck.com/kazutoshishinoda/lun-wen-shao-jie-direct-preference-optimization-your-language-model-is-secretly-a-reward-model
- Yang, G., & Hu, E. J. (2022). Feature Learning in Infinite-Width Neural Networks. https://doi.org/10.48550/arXiv.2011.14522
- Yang, G., Simon, J. B., & Bernstein, J. (2024). A Spectral Condition for Feature Learning. https://doi.org/10.48550/arXiv.2310.17813
- Yang, G., Hu, E. J., Babuschkin, I., Sidor, S., Farhi, D., Pachocki, J., Liu, X., Chen, W., & Gao, J. (2022, March 7). Tensor programs V: tuning large neural networks via zero-shot hyperparameter transfer. https://www.microsoft.com/en-us/research/publication/tuning-large-neural-networks-via-zero-shot-hyperparameter-transfer/
- Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K. R., & Cao, Y. (2022, September 29). ReAct: synergizing reasoning and acting in language models. https://openreview.net/forum?id=WE_vluYUL-X
- Yao, S., Yu, D., Zhao, J., Shafran, I., Griffiths, T. L., Cao, Y., & Narasimhan, K. R. (2023, November 2). Tree of Thoughts: deliberate problem solving with large language models. https://openreview.net/forum?id=5Xc1ecxO1h
- Yue, Z., Zhuang, H., Bai, A., Hui, K., Jagerman, R., Zeng, H., Qin, Z., Wang, D., Wang, X., & Bendersky, M. (2024, October 4). Inference Scaling for Long-Context Retrieval Augmented Generation. https://openreview.net/forum?id=FSjIrOm1vz
- Zelikman, E., Harik, G. R., Shao, Y., Jayasiri, V., Haber, N., & Goodman, N. (2024, August 26). Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking. https://openreview.net/forum?id=oRXPiSOGH9
- Zelikman, E., Wu, Y., Mu, J., & Goodman, N. (2022, October 31). STaR: bootstrapping reasoning with reasoning. https://openreview.net/forum?id=_3ELRdg2sgI
- Zeng, Z., Yu, J., Gao, T., Meng, Y., Goyal, T., & Chen, D. (2023, October 13). Evaluating Large Language Models at Evaluating Instruction Following. https://openreview.net/forum?id=tr0KidwPLc
- Zhao, X., Kang, Z., Feng, A., Levine, S., & Song, D. (2025). Learning to reason without external rewards. https://doi.org/10.48550/arXiv.2505.19590
- Zheng, L., Chiang, W.-L., Sheng, Y., Zhuang, S., Wu, Z., Zhuang, Y., Lin, Z., Li, Z., Li, D., Xing, E., Zhang, H., Gonzalez, J. E., & Stoica, I. (2023, November 2). Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena. https://openreview.net/forum?id=uccHPGDlao
- Zhou, C., Liu, P., Xu, P., Iyer, S., Sun, J., Mao, Y., Ma, X., Efrat, A., Yu, P., Yu, L., Zhang, S., Ghosh, G., Lewis, M., Zettlemoyer, L., & Levy, O. (2023, November 2). LIMA: less is more for alignment. https://openreview.net/forum?id=KBMOKmX2he
機械学習-LLMs-FineTuning (.bib)
- Dettmers, T., Pagnoni, A., Holtzman, A., & Zettlemoyer, L. (2023, November 2). QLoRA: Efficient Finetuning of Quantized LLMs. https://openreview.net/forum?id=OUIFPHEgJU
機械学習-LLMs-LLM-as-a-judge (.bib)
- Thakur, A. S., Choudhary, K., Ramayapally, V. S., Vaidyanathan, S., & Hupkes, D. (2024). Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges. https://doi.org/10.48550/arXiv.2406.12624
- Zeng, Z., Yu, J., Gao, T., Meng, Y., Goyal, T., & Chen, D. (2023, October 13). Evaluating Large Language Models at Evaluating Instruction Following. https://openreview.net/forum?id=tr0KidwPLc
- Zheng, L., Chiang, W.-L., Sheng, Y., Zhuang, S., Wu, Z., Zhuang, Y., Lin, Z., Li, Z., Li, D., Xing, E., Zhang, H., Gonzalez, J. E., & Stoica, I. (2023, November 2). Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena. https://openreview.net/forum?id=uccHPGDlao
機械学習-LLMs-RLHF (.bib)
- Ustalov, D., & Lambert, N. (2023, July 24). Reinforcement learning from human feedback: a tutorial [Tutorial]. ICML 2023. https://icml.cc/virtual/2023/tutorial/21554
機械学習-LLMs-Training (.bib)
- The Llama 3 herd of models. (2024). https://doi.org/10.48550/arXiv.2407.21783
- Rafailov, R., Sharma, A., Mitchell, E., Manning, C. D., Ermon, S., & Finn, C. (2023, November 2). Direct preference optimization: your language model is secretly a reward model. https://openreview.net/forum?id=HPuSIXJaa9
機械学習-LLMs-アラインメント (.bib)
- 1,000億パラメータの独自LLM「PLaMo-100B」の事後学習が完了. (2024). Preferred Networks Research & Development. https://tech.preferred.jp/ja/blog/plamo-100b-post-training/
- Askell, A., Bai, Y., Chen, A., Drain, D., Ganguli, D., Henighan, T., Jones, A., Joseph, N., Mann, B., DasSarma, N., Elhage, N., Hatfield-Dodds, Z., Hernandez, D., Kernion, J., Ndousse, K., Olsson, C., Amodei, D., Brown, T., Clark, J., … Kaplan, J. (2021). A general language assistant as a laboratory for alignment. https://doi.org/10.48550/arXiv.2112.00861
- Carlini, N., Ippolito, D., Jagielski, M., Lee, K., Tramer, F., & Zhang, C. (2022, September 29). Quantifying memorization across neural language models. https://openreview.net/forum?id=TatRHT_1cK
- Dong, H., Xiong, W., Pang, B., Wang, H., Zhao, H., Zhou, Y., Jiang, N., Sahoo, D., Xiong, C., & Zhang, T. (2024). RLHF Workflow: From Reward Modeling to Online RLHF. Transactions on Machine Learning Research. https://openreview.net/forum?id=a13aYUU9eU
- Emsley, R. (2023). ChatGPT: these are not hallucinations – they’re fabrications and falsifications. Schizophrenia, 9(1), 1–2. https://doi.org/10.1038/s41537-023-00379-4
- Schulman, J. (2023). Reinforcement learning from human feedback: progress and challenges [Video]. https://www.youtube.com/watch?v=hhiLw5Q_UFg
- Kirk, H. R., Jun, Y., Iqbal, H., Benussi, E., Volpin, F., Dreyer, F. A., Shtedritski, A., & Asano, Y. M. (2024). Bias out-of-the-box: an empirical analysis of intersectional occupational biases in popular generative language models. Proceedings of the 35th International Conference on Neural Information Processing Systems, 2611–2624.
- Li, T., Khashabi, D., Khot, T., Sabharwal, A., & Srikumar, V. (2020). UNQOVERing stereotyping biases via underspecified questions. In T. Cohn, Y. He, & Y. Liu (Eds.), Findings of the Association for Computational Linguistics: EMNLP 2020 (pp. 3475–3489). Association for Computational Linguistics. https://doi.org/10.18653/v1/2020.findings-emnlp.311
- OpenAI. (2023). GPT-4 technical report. https://doi.org/10.48550/arXiv.2303.08774
- Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C., Mishkin, P., Zhang, C., Agarwal, S., Slama, K., Ray, A., & others. (2022). Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems, 35, 27730–27744.
- Rafailov, R., Sharma, A., Mitchell, E., Manning, C. D., Ermon, S., & Finn, C. (2023, November 2). Direct preference optimization: your language model is secretly a reward model. https://openreview.net/forum?id=HPuSIXJaa9
- Santurkar, S., Durmus, E., Ladhak, F., Lee, C., Liang, P., & Hashimoto, T. (2023). Whose opinions do language models reflect? Proceedings of the 40th International Conference on Machine Learning, 202, 29971–30004.
- Schulman, J., Wolski, F., Dhariwal, P., Radford, A., & Klimov, O. (2017). Proximal Policy Optimization Algorithms. https://doi.org/10.48550/arXiv.1707.06347
- Stiennon, N., Ouyang, L., Wu, J., Ziegler, D. M., Lowe, R., Voss, C., Radford, A., Amodei, D., & Christiano, P. (2020). Learning to summarize from human feedback. Proceedings of the 34th International Conference on Neural Information Processing Systems, 3008–3021.
- Tal, Y., Magar, I., & Schwartz, R. (2022). Fewer errors, but more stereotypes? The effect of model size on gender bias. In C. Hardmeier, C. Basta, M. R. Costa-jussà, G. Stanovsky, & H. Gonen (Eds.), Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP) (pp. 112–120). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.gebnlp-1.13
- 篠田 一聡. (2024年8月25日). 論文紹介:Direct Preference Optimization: Your Language Model Is Secretly a Reward Model. https://speakerdeck.com/kazutoshishinoda/lun-wen-shao-jie-direct-preference-optimization-your-language-model-is-secretly-a-reward-model
- Zhou, C., Liu, P., Xu, P., Iyer, S., Sun, J., Mao, Y., Ma, X., Efrat, A., Yu, P., Yu, L., Zhang, S., Ghosh, G., Lewis, M., Zettlemoyer, L., & Levy, O. (2023, November 2). LIMA: less is more for alignment. https://openreview.net/forum?id=KBMOKmX2he
機械学習-LLMs-スケーリング則 (.bib)
- Training compute-optimal large language models. (2022). http://arxiv.org/abs/2203.15556
- Kaplan, J., McCandlish, S., Henighan, T., Brown, T. B., Chess, B., Child, R., Gray, S., Radford, A., Wu, J., & Amodei, D. (2020). Scaling laws for neural language models. https://doi.org/10.48550/arXiv.2001.08361
- Li, C. (2020). OpenAI’s GPT-3 language model: a technical overview. https://lambdalabs.com/blog/demystifying-gpt-3
- Schaeffer, R., Miranda, B., & Koyejo, S. (2023, November 2). Are emergent abilities of large language models a mirage? https://openreview.net/forum?id=ITw9edRDlD
- Wei, J., Tay, Y., Bommasani, R., Raffel, C., Zoph, B., Borgeaud, S., Yogatama, D., Bosma, M., Zhou, D., Metzler, D., Chi, E. H., Hashimoto, T., Vinyals, O., Liang, P., Dean, J., & Fedus, W. (2022). Emergent abilities of large language models. Transactions on Machine Learning Research. https://openreview.net/forum?id=yzkSU5zdwD
- Yang, G., & Hu, E. J. (2022). Feature Learning in Infinite-Width Neural Networks. https://doi.org/10.48550/arXiv.2011.14522
- Yang, G., Simon, J. B., & Bernstein, J. (2024). A Spectral Condition for Feature Learning. https://doi.org/10.48550/arXiv.2310.17813
- Yang, G., Hu, E. J., Babuschkin, I., Sidor, S., Farhi, D., Pachocki, J., Liu, X., Chen, W., & Gao, J. (2022, March 7). Tensor programs V: tuning large neural networks via zero-shot hyperparameter transfer. https://www.microsoft.com/en-us/research/publication/tuning-large-neural-networks-via-zero-shot-hyperparameter-transfer/
機械学習-LLMs-プロンプティング (.bib)
- Kojima, T., Gu, S. S., Reid, M., Matsuo, Y., & Iwasawa, Y. (2022). Large language models are zero-shot reasoners. Proceedings of the 36th International Conference on Neural Information Processing Systems, 22199–22213. https://doi.org/10.5555/3600270.3601883
- Razeghi, Y., Logan IV, R. L., Gardner, M., & Singh, S. (2022). Impact of pretraining term frequencies on few-shot numerical reasoning. In Y. Goldberg, Z. Kozareva, & Y. Zhang (Eds.), Findings of the Association for Computational Linguistics: EMNLP 2022 (pp. 840–854). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.findings-emnlp.59
- Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B., Xia, F., Chi, E. H., Le, Q. V., & Zhou, D. (2024). Chain-of-thought prompting elicits reasoning in large language models. Proceedings of the 36th International Conference on Neural Information Processing Systems, 24824–24837.
機械学習-LLMs-全般 (.bib)
- Azevedo, F. A. C., Carvalho, L. R. B., Grinberg, L. T., Farfel, J. M., Ferretti, R. E. L., Leite, R. E. P., Filho, W. J., Lent, R., & Herculano-Houzel, S. (2009). Equal numbers of neuronal and nonneuronal cells make the human brain an isometrically scaled-up primate brain. Journal of Comparative Neurology, 513(5), 532–541. https://doi.org/10.1002/cne.21974
- An empirical analysis of compute-optimal large language model training. (2022, October 31). https://openreview.net/forum?id=iBBcRUlOAPR
- 山田 育矢, 鈴木 正敏, 山田 康輔, & 李 凌寒. (2023). 大規模言語モデル入門. 技術評論社.
- 山田 育矢, 鈴木 正敏, 西川 荘介, 藤井 一喜, 山田 康輔, & 李 凌寒. (2024). 大規模言語モデル入門Ⅱ〜生成型LLMの実装と評価. 技術評論社.
機械学習-LLMs-文脈内学習 (.bib)
- Min, S., Lewis, M., Hajishirzi, H., & Zettlemoyer, L. (2022). Noisy channel language model prompting for few-shot text classification. In S. Muresan, P. Nakov, & A. Villavicencio (Eds.), Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 5316–5330). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.acl-long.365
- Min, S., Lyu, X., Holtzman, A., Artetxe, M., Lewis, M., Hajishirzi, H., & Zettlemoyer, L. (2022). Rethinking the role of demonstrations: what makes in-context learning work? In Y. Goldberg, Z. Kozareva, & Y. Zhang (Eds.), Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (pp. 11048–11064). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.emnlp-main.759
- Transformers learn in-context by gradient descent. (2022). https://arxiv.org/abs/2212.07677v2
- Wei, J., Wei, J., Tay, Y., Tran, D., Webson, A., Lu, Y., Chen, X., Liu, H., Huang, D., Zhou, D., & Ma, T. (2023). Larger language models do in-context learning differently. https://openreview.net/forum?id=DRGnEkbiQZ
機械学習-LLMs-評価 (.bib)
- Wang, Y., Li, H., Han, X., Nakov, P., & Baldwin, T. (2023). Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs. https://doi.org/10.48550/arXiv.2308.13387
機械学習-LLMs (.bib)
- 1,000億パラメータの独自LLM「PLaMo-100B」の事後学習が完了. (2024). Preferred Networks Research & Development. https://tech.preferred.jp/ja/blog/plamo-100b-post-training/
- Askell, A., Bai, Y., Chen, A., Drain, D., Ganguli, D., Henighan, T., Jones, A., Joseph, N., Mann, B., DasSarma, N., Elhage, N., Hatfield-Dodds, Z., Hernandez, D., Kernion, J., Ndousse, K., Olsson, C., Amodei, D., Brown, T., Clark, J., … Kaplan, J. (2021). A general language assistant as a laboratory for alignment. https://doi.org/10.48550/arXiv.2112.00861
- Azevedo, F. A. C., Carvalho, L. R. B., Grinberg, L. T., Farfel, J. M., Ferretti, R. E. L., Leite, R. E. P., Filho, W. J., Lent, R., & Herculano-Houzel, S. (2009). Equal numbers of neuronal and nonneuronal cells make the human brain an isometrically scaled-up primate brain. Journal of Comparative Neurology, 513(5), 532–541. https://doi.org/10.1002/cne.21974
- Bubeck, S., Chandrasekaran, V., Eldan, R., Gehrke, J., Horvitz, E., Kamar, E., Lee, P., Lee, Y. T., Li, Y., Lundberg, S., Nori, H., Palangi, H., Ribeiro, M. T., & Zhang, Y. (2023). Sparks of Artificial General Intelligence: Early experiments with GPT-4. https://doi.org/10.48550/arXiv.2303.12712
- Carlini, N., Ippolito, D., Jagielski, M., Lee, K., Tramer, F., & Zhang, C. (2022, September 29). Quantifying memorization across neural language models. https://openreview.net/forum?id=TatRHT_1cK
- Dettmers, T., Pagnoni, A., Holtzman, A., & Zettlemoyer, L. (2023, November 2). QLoRA: Efficient Finetuning of Quantized LLMs. https://openreview.net/forum?id=OUIFPHEgJU
- Dong, H., Xiong, W., Pang, B., Wang, H., Zhao, H., Zhou, Y., Jiang, N., Sahoo, D., Xiong, C., & Zhang, T. (2024). RLHF Workflow: From Reward Modeling to Online RLHF. Transactions on Machine Learning Research. https://openreview.net/forum?id=a13aYUU9eU
- The Llama 3 herd of models. (2024). https://doi.org/10.48550/arXiv.2407.21783
- Eloundou, T., Manning, S., Mishkin, P., & Rock, D. (2023). GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models. https://doi.org/10.48550/arXiv.2303.10130
- Eloundou, T., Manning, S., Mishkin, P., & Rock, D. (2024). GPTs are GPTs: Labor market impact potential of LLMs. Science, 384(6702), 1306–1308. https://doi.org/10.1126/science.adj0998
- Emsley, R. (2023). ChatGPT: these are not hallucinations – they’re fabrications and falsifications. Schizophrenia, 9(1), 1–2. https://doi.org/10.1038/s41537-023-00379-4
- Hayou, S., Ghosh, N., & Yu, B. (2024). LoRA+: Efficient Low Rank Adaptation of Large Models. Proceedings of the 41st International Conference on Machine Learning, 17783–17806. https://proceedings.mlr.press/v235/hayou24a.html
- 横井 祥. (2024). 「確率的なオウム」にできること、またそれがなぜできるのかについて. https://speakerdeck.com/eumesy/language-models-as-modern-version-of-the-use-theory-of-meaning
- An empirical analysis of compute-optimal large language model training. (2022, October 31). https://openreview.net/forum?id=iBBcRUlOAPR
- Training compute-optimal large language models. (2022). http://arxiv.org/abs/2203.15556
- Hu, E. J., Shen, Y., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, S., Wang, L., & Chen, W. (2022). LoRA: Low-rank adaptation of large language models. International Conference on Learning Representations. https://openreview.net/forum?id=nZeVKeeFYf9
- Schulman, J. (2023). Reinforcement learning from human feedback: progress and challenges [Video]. https://www.youtube.com/watch?v=hhiLw5Q_UFg
- Kaplan, J., McCandlish, S., Henighan, T., Brown, T. B., Chess, B., Child, R., Gray, S., Radford, A., Wu, J., & Amodei, D. (2020). Scaling laws for neural language models. https://doi.org/10.48550/arXiv.2001.08361
- Kirk, H. R., Jun, Y., Iqbal, H., Benussi, E., Volpin, F., Dreyer, F. A., Shtedritski, A., & Asano, Y. M. (2024). Bias out-of-the-box: an empirical analysis of intersectional occupational biases in popular generative language models. Proceedings of the 35th International Conference on Neural Information Processing Systems, 2611–2624.
- Kojima, T., Gu, S. S., Reid, M., Matsuo, Y., & Iwasawa, Y. (2022). Large language models are zero-shot reasoners. Proceedings of the 36th International Conference on Neural Information Processing Systems, 22199–22213. https://doi.org/10.5555/3600270.3601883
- Learn Prompting: Your Guide to Communicating with AI. (2023). https://learnprompting.org/
- Li, C. (2020). OpenAI’s GPT-3 language model: a technical overview. https://lambdalabs.com/blog/demystifying-gpt-3
- Li, T., Khashabi, D., Khot, T., Sabharwal, A., & Srikumar, V. (2020). UNQOVERing stereotyping biases via underspecified questions. In T. Cohn, Y. He, & Y. Liu (Eds.), Findings of the Association for Computational Linguistics: EMNLP 2020 (pp. 3475–3489). Association for Computational Linguistics. https://doi.org/10.18653/v1/2020.findings-emnlp.311
- Min, S., Lewis, M., Hajishirzi, H., & Zettlemoyer, L. (2022). Noisy channel language model prompting for few-shot text classification. In S. Muresan, P. Nakov, & A. Villavicencio (Eds.), Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 5316–5330). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.acl-long.365
- Min, S., Lyu, X., Holtzman, A., Artetxe, M., Lewis, M., Hajishirzi, H., & Zettlemoyer, L. (2022). Rethinking the role of demonstrations: what makes in-context learning work? In Y. Goldberg, Z. Kozareva, & Y. Zhang (Eds.), Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (pp. 11048–11064). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.emnlp-main.759
- OpenAI Platform. Retrieved August 29, 2024, from https://platform.openai.com
- OpenAI. (2023). GPT-4 technical report. https://doi.org/10.48550/arXiv.2303.08774
- Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C., Mishkin, P., Zhang, C., Agarwal, S., Slama, K., Ray, A., & others. (2022). Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems, 35, 27730–27744.
- Prompt Engineering Guide – Nextra. Retrieved August 29, 2024, from https://www.promptingguide.ai/
- Rafailov, R., Sharma, A., Mitchell, E., Manning, C. D., Ermon, S., & Finn, C. (2023, November 2). Direct preference optimization: your language model is secretly a reward model. https://openreview.net/forum?id=HPuSIXJaa9
- Razeghi, Y., Logan IV, R. L., Gardner, M., & Singh, S. (2022). Impact of pretraining term frequencies on few-shot numerical reasoning. In Y. Goldberg, Z. Kozareva, & Y. Zhang (Eds.), Findings of the Association for Computational Linguistics: EMNLP 2022 (pp. 840–854). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.findings-emnlp.59
- Santurkar, S., Durmus, E., Ladhak, F., Lee, C., Liang, P., & Hashimoto, T. (2023). Whose opinions do language models reflect? Proceedings of the 40th International Conference on Machine Learning, 202, 29971–30004.
- Schaeffer, R., Miranda, B., & Koyejo, S. (2023, November 2). Are emergent abilities of large language models a mirage? https://openreview.net/forum?id=ITw9edRDlD
- Schulhoff, S., Ilie, M., Balepur, N., Kahadze, K., Liu, A., Si, C., Li, Y., Gupta, A., Han, H. J., Schulhoff, S., Dulepet, P. S., Vidyadhara, S., Ki, D., Agrawal, S., Pham, C., Kroiz, G., Li, F., Tao, H., Srivastava, A., … Resnik, P. (2024). The Prompt Report: A Systematic Survey of Prompting Techniques. http://arxiv.org/abs/2406.06608
- Schulman, J., Wolski, F., Dhariwal, P., Radford, A., & Klimov, O. (2017). Proximal Policy Optimization Algorithms. https://doi.org/10.48550/arXiv.1707.06347
- 山田 育矢, 鈴木 正敏, 山田 康輔, & 李 凌寒. (2023). 大規模言語モデル入門. 技術評論社.
- 山田 育矢, 鈴木 正敏, 西川 荘介, 藤井 一喜, 山田 康輔, & 李 凌寒. (2024). 大規模言語モデル入門Ⅱ〜生成型LLMの実装と評価. 技術評論社.
- Stiennon, N., Ouyang, L., Wu, J., Ziegler, D. M., Lowe, R., Voss, C., Radford, A., Amodei, D., & Christiano, P. (2020). Learning to summarize from human feedback. Proceedings of the 34th International Conference on Neural Information Processing Systems, 3008–3021.
- Tal, Y., Magar, I., & Schwartz, R. (2022). Fewer errors, but more stereotypes? The effect of model size on gender bias. In C. Hardmeier, C. Basta, M. R. Costa-jussà, G. Stanovsky, & H. Gonen (Eds.), Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP) (pp. 112–120). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.gebnlp-1.13
- Thakur, A. S., Choudhary, K., Ramayapally, V. S., Vaidyanathan, S., & Hupkes, D. (2024). Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges. https://doi.org/10.48550/arXiv.2406.12624
- Ustalov, D., & Lambert, N. (2023, July 24). Reinforcement learning from human feedback: a tutorial [Tutorial]. ICML 2023. https://icml.cc/virtual/2023/tutorial/21554
- Transformers learn in-context by gradient descent. (2022). https://arxiv.org/abs/2212.07677v2
- Wang, Y., Li, H., Han, X., Nakov, P., & Baldwin, T. (2023). Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs. https://doi.org/10.48550/arXiv.2308.13387
- Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B., Xia, F., Chi, E. H., Le, Q. V., & Zhou, D. (2024). Chain-of-thought prompting elicits reasoning in large language models. Proceedings of the 36th International Conference on Neural Information Processing Systems, 24824–24837.
- Wei, J., Tay, Y., Bommasani, R., Raffel, C., Zoph, B., Borgeaud, S., Yogatama, D., Bosma, M., Zhou, D., Metzler, D., Chi, E. H., Hashimoto, T., Vinyals, O., Liang, P., Dean, J., & Fedus, W. (2022). Emergent abilities of large language models. Transactions on Machine Learning Research. https://openreview.net/forum?id=yzkSU5zdwD
- Wei, J., Wei, J., Tay, Y., Tran, D., Webson, A., Lu, Y., Chen, X., Liu, H., Huang, D., Zhou, D., & Ma, T. (2023). Larger language models do in-context learning differently. https://openreview.net/forum?id=DRGnEkbiQZ
- 篠田 一聡. (2024年8月25日). 論文紹介:Direct Preference Optimization: Your Language Model Is Secretly a Reward Model. https://speakerdeck.com/kazutoshishinoda/lun-wen-shao-jie-direct-preference-optimization-your-language-model-is-secretly-a-reward-model
- Yang, G., & Hu, E. J. (2022). Feature Learning in Infinite-Width Neural Networks. https://doi.org/10.48550/arXiv.2011.14522
- Yang, G., Simon, J. B., & Bernstein, J. (2024). A Spectral Condition for Feature Learning. https://doi.org/10.48550/arXiv.2310.17813
- Yang, G., Hu, E. J., Babuschkin, I., Sidor, S., Farhi, D., Pachocki, J., Liu, X., Chen, W., & Gao, J. (2022, March 7). Tensor programs V: tuning large neural networks via zero-shot hyperparameter transfer. https://www.microsoft.com/en-us/research/publication/tuning-large-neural-networks-via-zero-shot-hyperparameter-transfer/
- Zeng, Z., Yu, J., Gao, T., Meng, Y., Goyal, T., & Chen, D. (2023, October 13). Evaluating Large Language Models at Evaluating Instruction Following. https://openreview.net/forum?id=tr0KidwPLc
- Zheng, L., Chiang, W.-L., Sheng, Y., Zhuang, S., Wu, Z., Zhuang, Y., Lin, Z., Li, Z., Li, D., Xing, E., Zhang, H., Gonzalez, J. E., & Stoica, I. (2023, November 2). Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena. https://openreview.net/forum?id=uccHPGDlao
- Zhou, C., Liu, P., Xu, P., Iyer, S., Sun, J., Mao, Y., Ma, X., Efrat, A., Yu, P., Yu, L., Zhang, S., Ghosh, G., Lewis, M., Zettlemoyer, L., & Levy, O. (2023, November 2). LIMA: less is more for alignment. https://openreview.net/forum?id=KBMOKmX2he
機械学習-Metrics (.bib)
機械学習-タスク-分析・発見系-クラスタリング (.bib)
- Ahmad, A., & Khan, S. S. (2019). Survey of state-of-the-art mixed data clustering algorithms. IEEE Access, 7, 31883–31902. https://doi.org/10.1109/ACCESS.2019.2903568
- Arthur, D., & Vassilvitskii, S. (2007). k-means++: the advantages of careful seeding. Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, 1027–1035.
- Fraley, C., & Raftery, A. E. (2000). Model-based clustering, discriminant analysis, and density estimation (Technical Report No. 380). Department of Statistics, University of Washington. https://csss.uw.edu/files/working-papers/2000/wp11.pdf
- Cao, F., Liang, J., & Bai, L. (2009). A new initialization method for categorical data clustering. Expert Systems with Applications, 36(7), 10223–10228. https://doi.org/10.1016/j.eswa.2009.01.060
- A comprehensive survey of clustering algorithms: State-of-the-art machine learning applications, taxonomy, challenges, and future research prospects. (2022). Engineering Applications of Artificial Intelligence, 110, 104743. https://doi.org/10.1016/j.engappai.2022.104743
- Ester, M., Kriegel, H.-P., Sander, J., & Xu, X. (1996). A density-based algorithm for discovering clusters in large spatial databases with noise. Proceedings of 2nd International Conference on Knowledge Discovery and Data Mining.
- Fraley, C., & Raftery, A. E. (1998). How Many Clusters? Which Clustering Method? Answers Via Model-Based Cluster Analysis. The Computer Journal, 41(8), 578–588. https://doi.org/10.1093/comjnl/41.8.578
- Huang, Z. (1998). Extensions to the k-means algorithm for clustering large data sets with categorical values. Data Mining and Knowledge Discovery, 2(3), 283–304. https://doi.org/10.1023/A:1009769707641
- Kamvar, S. D., Klein, D., & Manning, C. D. (2002). Interpreting and extending classical agglomerative clustering algorithms using a model-based approach. Proceedings of the 19th International Conference on Machine Learning, 283–290.
- Miyamoto, S. (2012). Inductive and non-inductive methods of clustering. 2012 IEEE International Conference on Granular Computing, 12–17. https://doi.org/10.1109/GrC.2012.6468710
- Schubert, E., Sander, J., Ester, M., Kriegel, H. P., & Xu, X. (2017). DBSCAN revisited, revisited: why and how you should (still) use DBSCAN. ACM Transactions on Database Systems, 42(3), 19:1–19:21. https://doi.org/10.1145/3068335
- SIGKDD News: 2014 SIGKDD Test of Time Award. (2014). https://www.kdd.org/News/view/2014-sigkdd-test-of-time-award
- Ward Jr., J. H. (1963). Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association, 58(301), 236–244. https://doi.org/10.1080/01621459.1963.10500845
- Xu, D., & Tian, Y. (2015). A Comprehensive Survey of Clustering Algorithms. Annals of Data Science, 2(2), 165–193. https://doi.org/10.1007/s40745-015-0040-1
機械学習-タスク-分析・発見系-スパース表現獲得 (.bib)
- Elad, M. (2010). Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing. Springer. https://doi.org/10.1007/978-1-4419-7011-4
- 冨岡 亮太. (2015). スパース性に基づく機械学習 = Machine learning with sparsity inducing regularizations (MLP機械学習プロフェッショナルシリーズ) (講談社サイエンティフィク, Ed.). 講談社. https://elib.maruzen.co.jp/elib/html/BookDetail/Id/3000080942/
機械学習-タスク-分析・発見系-全般 (.bib)
- Manning, C. D., Raghavan, P., & Schütze, H. (2009). An Introduction to Information Retrieval. Cambridge University Press.
- 石井 健一郎, & 上田 修功. (2014). 教師なし学習入門 (わかりやすいパターン認識 続). オーム社.
機械学習-タスク-分析・発見系-変数選択 (.bib)
- Ge, D., Jiang, X., & Ye, Y. (2011). A note on the complexity of Lp minimization. Mathematical Programming, 129(2), 285–299. https://doi.org/10.1007/s10107-011-0470-2
- Natarajan, B. K. (1995). Sparse Approximate Solutions to Linear Systems. SIAM Journal on Computing, 24(2), 227–234. https://doi.org/10.1137/S0097539792240406
機械学習-タスク-分析・発見系-次元削減 (.bib)
- Campbell, N. D. F., & Kautz, J. (2014). Learning a manifold of fonts. ACM Transactions on Graphics, 33(4), 1–11. https://doi.org/10.1145/2601097.2601212
- Jolliffe, I. T. (2010). Principal Component Analysis (2nd ed.). Springer.
- Ma, Y., & Fu, Y. (2012). Manifold Learning Theory and Applications. CRC Press. http://www.amazon.com/Manifold-Learning-Theory-Applications-Yunqian/dp/1439871094
機械学習-タスク-分析・発見系-異常検知 (.bib)
- Bouman, R., Bukhsh, Z., & Heskes, T. (2024). Unsupervised anomaly detection algorithms on real-world data: how many do we need? Journal of Machine Learning Research, 25, 1–34.
- Hariri, S., Kind, M. C., & Brunner, R. J. (2021). Extended isolation forest. IEEE Transactions on Knowledge and Data Engineering, 33(4), 1479–1489. https://doi.org/10.1109/TKDE.2019.2947676
- Liu, F. T., Ting, K. M., & Zhou, Z.-H. (2012). Isolation-Based Anomaly Detection. ACM Transactions on Knowledge Discovery from Data, 6(1), 3:1–3:39. https://doi.org/10.1145/2133360.2133363
- Pelletier, B. (2024). On the statistical properties of the isolation forest anomaly detection method. https://hal.science/hal-04430185
機械学習-タスク-分析・発見系-選択的推論 (.bib)
- Lee, J. D., Sun, D. L., Sun, Y., & Taylor, J. E. (2016). Exact Post-Selection Inference, with Application to the Lasso. The Annals of Statistics, 44(3), 907–927. https://www.jstor.org/stable/43818915
機械学習-タスク-分析・発見系 (.bib)
- Ahmad, A., & Khan, S. S. (2019). Survey of state-of-the-art mixed data clustering algorithms. IEEE Access, 7, 31883–31902. https://doi.org/10.1109/ACCESS.2019.2903568
- Arthur, D., & Vassilvitskii, S. (2007). k-means++: the advantages of careful seeding. Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, 1027–1035.
- Bouman, R., Bukhsh, Z., & Heskes, T. (2024). Unsupervised anomaly detection algorithms on real-world data: how many do we need? Journal of Machine Learning Research, 25, 1–34.
- Fraley, C., & Raftery, A. E. (2000). Model-based clustering, discriminant analysis, and density estimation (Technical Report No. 380). Department of Statistics, University of Washington. https://csss.uw.edu/files/working-papers/2000/wp11.pdf
- Campbell, N. D. F., & Kautz, J. (2014). Learning a manifold of fonts. ACM Transactions on Graphics, 33(4), 1–11. https://doi.org/10.1145/2601097.2601212
- Cao, F., Liang, J., & Bai, L. (2009). A new initialization method for categorical data clustering. Expert Systems with Applications, 36(7), 10223–10228. https://doi.org/10.1016/j.eswa.2009.01.060
- A comprehensive survey of clustering algorithms: State-of-the-art machine learning applications, taxonomy, challenges, and future research prospects. (2022). Engineering Applications of Artificial Intelligence, 110, 104743. https://doi.org/10.1016/j.engappai.2022.104743
- Elad, M. (2010). Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing. Springer. https://doi.org/10.1007/978-1-4419-7011-4
- Ester, M., Kriegel, H.-P., Sander, J., & Xu, X. (1996). A density-based algorithm for discovering clusters in large spatial databases with noise. Proceedings of 2nd International Conference on Knowledge Discovery and Data Mining.
- Fraley, C., & Raftery, A. E. (1998). How Many Clusters? Which Clustering Method? Answers Via Model-Based Cluster Analysis. The Computer Journal, 41(8), 578–588. https://doi.org/10.1093/comjnl/41.8.578
- 冨岡 亮太. (2015). スパース性に基づく機械学習 = Machine learning with sparsity inducing regularizations (MLP機械学習プロフェッショナルシリーズ) (講談社サイエンティフィク, Ed.). 講談社. https://elib.maruzen.co.jp/elib/html/BookDetail/Id/3000080942/
- Ge, D., Jiang, X., & Ye, Y. (2011). A note on the complexity of Lp minimization. Mathematical Programming, 129(2), 285–299. https://doi.org/10.1007/s10107-011-0470-2
- Hariri, S., Kind, M. C., & Brunner, R. J. (2021). Extended isolation forest. IEEE Transactions on Knowledge and Data Engineering, 33(4), 1479–1489. https://doi.org/10.1109/TKDE.2019.2947676
- Huang, Z. (1998). Extensions to the k-means algorithm for clustering large data sets with categorical values. Data Mining and Knowledge Discovery, 2(3), 283–304. https://doi.org/10.1023/A:1009769707641
- Jolliffe, I. T. (2010). Principal Component Analysis (2nd ed.). Springer.
- Kamvar, S. D., Klein, D., & Manning, C. D. (2002). Interpreting and extending classical agglomerative clustering algorithms using a model-based approach. Proceedings of the 19th International Conference on Machine Learning, 283–290.
- Lee, J. D., Sun, D. L., Sun, Y., & Taylor, J. E. (2016). Exact Post-Selection Inference, with Application to the Lasso. The Annals of Statistics, 44(3), 907–927. https://www.jstor.org/stable/43818915
- Liu, F. T., Ting, K. M., & Zhou, Z.-H. (2012). Isolation-Based Anomaly Detection. ACM Transactions on Knowledge Discovery from Data, 6(1), 3:1–3:39. https://doi.org/10.1145/2133360.2133363
- Ma, Y., & Fu, Y. (2012). Manifold Learning Theory and Applications. CRC Press. http://www.amazon.com/Manifold-Learning-Theory-Applications-Yunqian/dp/1439871094
- Manning, C. D., Raghavan, P., & Schütze, H. (2009). An Introduction to Information Retrieval. Cambridge University Press.
- Miyamoto, S. (2012). Inductive and non-inductive methods of clustering. 2012 IEEE International Conference on Granular Computing, 12–17. https://doi.org/10.1109/GrC.2012.6468710
- Natarajan, B. K. (1995). Sparse Approximate Solutions to Linear Systems. SIAM Journal on Computing, 24(2), 227–234. https://doi.org/10.1137/S0097539792240406
- Pelletier, B. (2024). On the statistical properties of the isolation forest anomaly detection method. https://hal.science/hal-04430185
- Schubert, E., Sander, J., Ester, M., Kriegel, H. P., & Xu, X. (2017). DBSCAN revisited, revisited: why and how you should (still) use DBSCAN. ACM Transactions on Database Systems, 42(3), 19:1–19:21. https://doi.org/10.1145/3068335
- 石井 健一郎, & 上田 修功. (2014). 教師なし学習入門 (わかりやすいパターン認識 続). オーム社.
- SIGKDD News: 2014 SIGKDD Test of Time Award. (2014). https://www.kdd.org/News/view/2014-sigkdd-test-of-time-award
- Ward Jr., J. H. (1963). Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association, 58(301), 236–244. https://doi.org/10.1080/01621459.1963.10500845
- Xu, D., & Tian, Y. (2015). A Comprehensive Survey of Clustering Algorithms. Annals of Data Science, 2(2), 165–193. https://doi.org/10.1007/s40745-015-0040-1
機械学習-タスク-生成系 (.bib)
- Chan, C., Ginosar, S., Zhou, T., & Efros, A. (2019). Everybody dance now. 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 5932–5941. https://doi.org/10.1109/ICCV.2019.00603
- Dathathri, S., See, A., Ghaisas, S., Huang, P.-S., McAdam, R., Welbl, J., Bachani, V., Kaskasoli, A., Stanforth, R., Matejovicova, T., Hayes, J., Vyas, N., Merey, M. A., Brown-Cohen, J., Bunel, R., Balle, B., Cemgil, T., Ahmed, Z., Stacpoole, K., … Kohli, P. (2024). Scalable watermarking for identifying large language model outputs. Nature, 634(8035), 818–823. https://doi.org/10.1038/s41586-024-08025-4
- Gottesman, Y. (2023). Understand diffusion models with VAEs. Yoni Gottesman. https://yonigottesman.github.io/2023/03/11/vae.html
- Ho, J., Jain, A., & Abbeel, P. (2020). Denoising diffusion probabilistic models. In Proceedings of the 34th International Conference on Neural Information Processing Systems (pp. 6840–6851). Curran Associates Inc.
- Song, Y., & Ermon, S. (2019). Generative modeling by estimating gradients of the data distribution. In Proceedings of the 33rd International Conference on Neural Information Processing Systems (Number 1067, pp. 11918–11930). Curran Associates Inc.
- Song, Y. (2021). Generative modeling by estimating gradients of the data distribution. Yang Song. https://yang-song.net/blog/2021/score
- Song, Y., Durkan, C., Murray, I., & Ermon, S. (2021, November 9). Maximum likelihood training of score-based diffusion models. https://openreview.net/forum?id=AklttWFnxS9
- Vincent, P. (2011). A connection between score matching and denoising autoencoders. Neural Computation, 23(7), 1661–1674. https://doi.org/10.1162/NECO_a_00142
機械学習-データの性質の仮説 (.bib)
機械学習-トピックス-Conformal prediction (.bib)
- Angelopoulos, A. N., & Bates, S. (2022). A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification. http://arxiv.org/abs/2107.07511
- Vovk, V., Gammerman, A., & Shafer, G. (2022). Algorithmic Learning in a Random World. Springer International Publishing. https://doi.org/10.1007/978-3-031-06649-8
機械学習-トピックス (.bib)
- Angelopoulos, A. N., & Bates, S. (2022). A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification. http://arxiv.org/abs/2107.07511
- Vovk, V., Gammerman, A., & Shafer, G. (2022). Algorithmic Learning in a Random World. Springer International Publishing. https://doi.org/10.1007/978-3-031-06649-8
機械学習-ドメイン-テキスト (.bib)
- Ali, M., Fromm, M., Thellmann, K., Rutmann, R., Lübbering, M., Leveling, J., Klug, K., Ebert, J., Doll, N., Buschhoff, J., Jain, C., Weber, A., Jurkschat, L., Abdelwahab, H., John, C., Ortiz Suarez, P., Ostendorff, M., Weinbach, S., Sifa, R., … Flores-Herr, N. (2024). Tokenizer choice for LLM training: negligible or crucial? In K. Duh, H. Gomez, & S. Bethard (Eds.), Findings of the Association for Computational Linguistics: NAACL 2024 (pp. 3907–3924). Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.findings-naacl.247
- Arnett, C., Chang, T. A., & Bergen, B. (2024). A Bit of a Problem: Measurement Disparities in Dataset Sizes across Languages. In M. Melero, S. Sakti, & C. Soria (Eds.), Proceedings of the 3rd Annual Meeting of the Special Interest Group on Under-resourced Languages @ LREC-COLING 2024 (pp. 1–9). ELRA and ICCL. https://aclanthology.org/2024.sigul-1.1/
- Arnett, C., & Bergen, B. (2025). Why do language models perform worse for morphologically complex languages? In O. Rambow, L. Wanner, M. Apidianaki, H. Al-Khalifa, B. D. Eugenio, & S. Schockaert (Eds.), Proceedings of the 31st International Conference on Computational Linguistics (pp. 6607–6623). Association for Computational Linguistics. https://aclanthology.org/2025.coling-main.441/
- Arora, S., Liang, Y., & Ma, T. (2017, February 6). A simple but tough-to-beat baseline for sentence embeddings. https://openreview.net/forum?id=SyK00v5xx
- Chen, S., Wong, S., Chen, L., & Tian, Y. (2023). Extending context window of large language models via positional interpolation. https://doi.org/10.48550/arXiv.2306.15595
- Distributional Hypothesis - ACL Wiki. Retrieved June 22, 2025, from https://www.aclweb.org/aclwiki/index.php?title=Distributional_Hypothesis
- Dutta, D., Ansari, F., Chakrabarty, A., & Das, S. (2025). On the existence of universal simulators of attention. https://doi.org/10.48550/arXiv.2506.18739
- Hutchins, J. (1995). "The whisky was invisible", or Persistent myths of MT. MT News International, 11. https://web.archive.org/web/20210103041306/http://www.hutchinsweb.me.uk/MTNI-11-1995.pdf
- Kallini, J., Papadimitriou, I., Futrell, R., Mahowald, K., & Potts, C. (2024). Mission: Impossible language models. In L.-W. Ku, A. Martins, & V. Srikumar (Eds.), Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 14691–14714). Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.acl-long.787
- Lu, Y., & Morgan, J. L. (2020). Homophone auditory processing in cross-linguistic perspective. Proceedings of the Linguistic Society of America, 5(1), 529–542. https://doi.org/10.3765/plsa.v5i1.4733
- Mikolov, T., Sutskever, I., Chen, K., Corrado, G., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. Proceedings of the 27th International Conference on Neural Information Processing Systems - Volume 2, 3111–3119.
- 坪井 祐太, 海野 裕也, & 鈴木 潤. (2017). 深層学習による自然言語処理. 講談社.
- Robertson, S., & Zaragoza, H. (2009). The probabilistic relevance framework: BM25 and beyond. Foundations and Trends in Information Retrieval, 3(4), 333–389. https://doi.org/10.1561/1500000019
- Firth, J. R. (1957). A synopsis of linguistic theory, 1930–1955. Studies in Linguistic Analysis. https://cir.nii.ac.jp/crid/1570854175539816192?lang=en
- 山田 育矢, 鈴木 正敏, 山田 康輔, & 李 凌寒. (2023). 大規模言語モデル入門. 技術評論社.
- 山田 育矢, 鈴木 正敏, 西川 荘介, 藤井 一喜, 山田 康輔, & 李 凌寒. (2024). 大規模言語モデル入門Ⅱ〜生成型LLMの実装と評価. 技術評論社.
- Jurafsky, D., & Martin, J. H. Speech and Language Processing. Retrieved June 18, 2025, from https://web.stanford.edu/~jurafsky/slp3/
- Su, J., Ahmed, M., Lu, Y., Pan, S., Bo, W., & Liu, Y. (2024). RoFormer: enhanced transformer with rotary position embedding. Neurocomputing, 568(C). https://doi.org/10.1016/j.neucom.2023.127063
- Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. Proceedings of the 31st International Conference on Neural Information Processing Systems, 6000–6010.
- Wang, A., Singh, A., Michael, J., Hill, F., Levy, O., & Bowman, S. (2018). GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding (T. Linzen, G. Chrupała, & A. Alishahi, Eds.; pp. 353–355). Association for Computational Linguistics. https://doi.org/10.18653/v1/W18-5446
- Warner, B., Chaffin, A., Clavié, B., Weller, O., Hallström, O., Taghadouini, S., Gallagher, A., Biswas, R., Ladhak, F., Aarsen, T., Cooper, N., Adams, G., Howard, J., & Poli, I. (2024). Smarter, better, faster, longer: a modern bidirectional encoder for fast, memory efficient, and long context finetuning and inference. https://doi.org/10.48550/arXiv.2412.13663
- Wettig, A., Gao, T., Zhong, Z., & Chen, D. (2023). Should you mask 15% in masked language modeling? In A. Vlachos & I. Augenstein (Eds.), Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (pp. 2985–3000). Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.eacl-main.217
- Xiong, R., Yang, Y., He, D., Zheng, K., Zheng, S., Xing, C., Zhang, H., Lan, Y., Wang, L., & Liu, T.-Y. (2020). On layer normalization in the transformer architecture. Proceedings of the 37th International Conference on Machine Learning, 119, 10524–10533.
- 岩田 具治. (2015). トピックモデル. 講談社.
- Zhang, Y., & Teng, Z. (2021). Natural Language Processing: A Machine Learning Perspective. Cambridge University Press.
- Zhang, B., & Sennrich, R. (2019). Root mean square layer normalization. In Proceedings of the 33rd International Conference on Neural Information Processing Systems (Number 1110, pp. 12381–12392). Curran Associates Inc.
機械学習-ドメイン-ネットワーク (.bib)
- Clauset, A., Newman, M. E. J., & Moore, C. (2004). Finding community structure in very large networks. Physical Review E, 70(6), 066111. https://doi.org/10.1103/PhysRevE.70.066111
- Girvan, M., & Newman, M. E. J. (2002). Community structure in social and biological networks. Proceedings of the National Academy of Sciences, 99(12), 7821–7826. https://doi.org/10.1073/pnas.122653799
- Newman, M. E. J. (2004). Fast algorithm for detecting community structure in networks. Physical Review E, 69(6), 066133. https://doi.org/10.1103/PhysRevE.69.066133
- Newman, M. E. J., & Girvan, M. (2004). Finding and evaluating community structure in networks. Physical Review E, 69(2), 026113. https://doi.org/10.1103/PhysRevE.69.026113
- Newman, M. E. J. (2006). Modularity and community structure in networks. Proceedings of the National Academy of Sciences, 103(23), 8577–8582. https://doi.org/10.1073/pnas.0601602103
- Zachary, W. W. (1977). An information flow model for conflict and fission in small groups. Journal of Anthropological Research, 33(4), 452–473. http://www.jstor.org/stable/3629752
- 増田 直紀, & 今野 紀雄. (2010). 複雑ネットワーク―基礎から応用まで. 近代科学社.
機械学習-ドメイン-マルチモーダル (.bib)
- Be My Eyes (Ed.). (2024). Be My Eyes Accessibility with GPT-4o. https://www.youtube.com/watch?v=Zq710AKC1gg
- Fu, L., Yang, B., Kuang, Z., Song, J., Li, Y., Zhu, L., Luo, Q., Wang, X., Lu, H., Huang, M., Li, Z., Tang, G., Shan, B., Lin, C., Liu, Q., Wu, B., Feng, H., Liu, H., Huang, C., … Bai, X. (2024). OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning (Version 1). https://doi.org/10.48550/arXiv.2501.00321
- Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., Krueger, G., & Sutskever, I. (2021). Learning transferable visual models from natural language supervision. Proceedings of the 38th International Conference on Machine Learning, 8748–8763. https://proceedings.mlr.press/v139/radford21a.html
- van den Oord, A., Vinyals, O., & Kavukcuoglu, K. (2017). Neural discrete representation learning. Proceedings of the 31st International Conference on Neural Information Processing Systems, 6309–6318.
- Xu, H., Xie, S., Tan, X., Huang, P.-Y., Howes, R., Sharma, V., Li, S.-W., Ghosh, G., Zettlemoyer, L., & Feichtenhofer, C. (2023, October 13). Demystifying CLIP data. https://openreview.net/forum?id=5BCFlnfE1g
- Zhou, C., Yu, L., Babu, A., Tirumala, K., Yasunaga, M., Shamis, L., Kahn, J., Ma, X., Zettlemoyer, L., & Levy, O. (2024, October 4). Transfusion: predict the next token and diffuse images with one multi-modal model. https://openreview.net/forum?id=SI2hI0frk6
機械学習-ドメイン-化学 (.bib)
- Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., Tunyasuvunakool, K., Bates, R., Žídek, A., Potapenko, A., Bridgland, A., Meyer, C., Kohl, S. A. A., Ballard, A. J., Cowie, A., Romera-Paredes, B., Nikolov, S., Jain, R., Adler, J., … Hassabis, D. (2021). Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873), 583–589. https://doi.org/10.1038/s41586-021-03819-2
機械学習-ドメイン-画像 (.bib)
- Chen, T., Kornblith, S., Norouzi, M., & Hinton, G. (2020). A simple framework for contrastive learning of visual representations. Proceedings of the 37th International Conference on Machine Learning, 1597–1607. https://proceedings.mlr.press/v119/chen20j.html
- Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., & Fei-Fei, L. (2009). ImageNet: A large-scale hierarchical image database. 2009 IEEE Conference on Computer Vision and Pattern Recognition, 248–255. https://doi.org/10.1109/CVPR.2009.5206848
- drhead. (2024). The VAE used for Stable Diffusion 1.x/2.x and other models (KL-F8) has a critical flaw, probably due to bad training, that is holding back all models that use it (almost certainly including DALL-E 3). [Reddit Post]. r/StableDiffusion. https://www.reddit.com/r/StableDiffusion/comments/1ag5h5s/the_vae_used_for_stable_diffusion_1x2x_and_other/
- Entezari, R., Wortsman, M., Saukh, O., Shariatnia, M. M., Sedghi, H., & Schmidt, L. (2023). The role of pre-training data in transfer learning. https://doi.org/10.48550/arXiv.2302.13602
- Geirhos, R., Rubisch, P., Michaelis, C., Bethge, M., Wichmann, F. A., & Brendel, W. (2018, September 27). ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness. https://openreview.net/forum?id=Bygh9j09KX
- Hermann, K., Mobahi, H., Fel, T., & Mozer, M. C. (2023, October 13). On the foundations of shortcut learning. https://openreview.net/forum?id=Tj3xLVuE9f
- Jocher, G., Qiu, J., & Chaurasia, A. (2023). Ultralytics YOLO (Version 8.0.0). https://github.com/ultralytics/ultralytics
- Kim, J. (2025). kjsman/stable-diffusion-pytorch. https://github.com/kjsman/stable-diffusion-pytorch
- Odena, A., Dumoulin, V., & Olah, C. (2016). Deconvolution and checkerboard artifacts. Distill, 1(10), e3. https://doi.org/10.23915/distill.00003
- Plesner, A., Vontobel, T., & Wattenhofer, R. (2024). Breaking reCAPTCHAv2. https://doi.org/10.1109/COMPSAC61105.2024.00142
- Raghu, M., Zhang, C., Kleinberg, J., & Bengio, S. (2019). Transfusion: understanding transfer learning for medical imaging. In Proceedings of the 33rd International Conference on Neural Information Processing Systems (Number 301, pp. 3347–3357). Curran Associates Inc.
- Rombach, R., Blattmann, A., Lorenz, D., Esser, P., & Ommer, B. (2022). High-resolution image synthesis with latent diffusion models. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 10674–10685. https://doi.org/10.1109/CVPR52688.2022.01042
- Ronneberger, O., Fischer, P., & Brox, T. (2015). U-Net: convolutional networks for biomedical image segmentation. In N. Navab, J. Hornegger, W. M. Wells, & A. F. Frangi (Eds.), Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015 (pp. 234–241). Springer International Publishing. https://doi.org/10.1007/978-3-319-24574-4_28
- Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A. C., & Fei-Fei, L. (2015). ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 115(3), 211–252. https://doi.org/10.1007/s11263-015-0816-y
- Sharif, M., Bhagavatula, S., Bauer, L., & Reiter, M. K. (2016). Accessorize to a crime: real and stealthy attacks on state-of-the-art face recognition. Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, 1528–1540. https://doi.org/10.1145/2976749.2978392
- Szeliski, R. (2022). Computer Vision: Algorithms and Applications. Springer. https://szeliski.org/Book/
- THU-MIG. (2024). THU-MIG/yolov10. https://github.com/THU-MIG/yolov10
- Ultralytics. Ultralytics YOLO11 object detection model. Retrieved June 12, 2025, from https://github.com/ultralytics/ultralytics/blob/da98efc61d9e0467315fc86c2297c8d81e656b1a/ultralytics/cfg/models/11/yolo11.yaml
- Wang, A., Chen, H., Liu, L., Chen, K., Lin, Z., Han, J., & Ding, G. (2024). YOLOv10: Real-Time End-to-End Object Detection. Advances in Neural Information Processing Systems, 37, 107984–108011. https://proceedings.neurips.cc/paper_files/paper/2024/hash/c34ddd05eb089991f06f3c5dc36836e0-Abstract-Conference.html
- Wu, Y., & He, K. (2018). Group normalization. Computer Vision – ECCV 2018: 15th European Conference, Munich, Germany, September 8-14, 2018, Proceedings, Part XIII, 3–19. https://doi.org/10.1007/978-3-030-01261-8_1
- 原田 達也. (2017). 画像認識. 講談社. http://opac.dl.itc.u-tokyo.ac.jp/opac/opac_details/?lang=0&amode=11&bibid=2003372347
機械学習-モデル - ニューラルネットワーク (.bib)
- Foret, P., Kleiner, A., Mobahi, H., & Neyshabur, B. (2021). Sharpness-Aware Minimization for Efficiently Improving Generalization. https://doi.org/10.48550/arXiv.2010.01412
- Hinton, G. E., Srivastava, N., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. R. (2012). Improving neural networks by preventing co-adaptation of feature detectors. https://doi.org/10.48550/arXiv.1207.0580
機械学習-モデル-k最近傍法 (.bib)
- Abu Alfeilat, H. A., Hassanat, A. B. A., Lasassmeh, O., Tarawneh, A. S., Alhasanat, M. B., Eyal Salman, H. S., & Prasath, V. B. S. (2019). Effects of distance measure choice on K-nearest neighbor classifier performance: a review. Big Data, 7(4), 221–248. https://doi.org/10.1089/big.2018.0175
- Aumüller, M., Bernhardsson, E., & Faithfull, A. (2018). ANN-Benchmarks: A benchmarking tool for approximate nearest neighbor algorithms. https://doi.org/10.48550/arXiv.1807.05614
- Malkov, Y. A., & Yashunin, D. A. (2018). Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs. https://doi.org/10.48550/arXiv.1603.09320
- Parry, R. M., Jones, W., Stokes, T. H., Phan, J. H., Moffitt, R. A., Fang, H., Shi, L., Oberthuer, A., Fischer, M., Tong, W., & Wang, M. D. (2010). k-Nearest neighbor models for microarray gene expression analysis and clinical outcome prediction. The Pharmacogenomics Journal, 10(4), 292–309. https://doi.org/10.1038/tpj.2010.56
- Todeschini, R., Ballabio, D., & Consonni, V. (2020). Distances and similarity measures in chemometrics and chemoinformatics. In Encyclopedia of Analytical Chemistry (pp. 1–40). John Wiley & Sons, Ltd. https://doi.org/10.1002/9780470027318.a9438.pub2
機械学習-モデル-ニューラルネットワーク (.bib)
- Foret, P., Kleiner, A., Mobahi, H., & Neyshabur, B. (2021). Sharpness-Aware Minimization for Efficiently Improving Generalization. https://doi.org/10.48550/arXiv.2010.01412
- Guo, C., Pleiss, G., Sun, Y., & Weinberger, K. Q. (2017). On calibration of modern neural networks. Proceedings of the 34th International Conference on Machine Learning - Volume 70, 1321–1330.
- Hinton, G. E., Srivastava, N., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. R. (2012). Improving neural networks by preventing co-adaptation of feature detectors. https://doi.org/10.48550/arXiv.1207.0580
- Oikarinen, T., & Weng, T.-W. (2022, September 29). CLIP-Dissect: Automatic Description of Neuron Representations in Deep Vision Networks. https://openreview.net/forum?id=iPWiwWHc1V
- Zhang, C., Bengio, S., Hardt, M., Recht, B., & Vinyals, O. (2017). Understanding deep learning requires rethinking generalization. https://doi.org/10.48550/arXiv.1611.03530
機械学習-モデル-ノンパラメトリックモデル (.bib)
- Belkin, M., Rakhlin, A., & Tsybakov, A. B. (2019). Does data interpolation contradict statistical optimality? The 22nd International Conference on Artificial Intelligence and Statistics, 1611–1619. http://proceedings.mlr.press/v89/belkin19a.html
機械学習-モデル-線型モデル (.bib)
- Seber, G. A. F., & Lee, A. J. (2003). Linear Regression Analysis (2nd ed.). Wiley.
機械学習-モデル-過剰パラメーターモデル (.bib)
- Belkin, M., Rakhlin, A., & Tsybakov, A. B. (2019). Does data interpolation contradict statistical optimality? The 22nd International Conference on Artificial Intelligence and Statistics, 1611–1619. http://proceedings.mlr.press/v89/belkin19a.html
- Li, H., Xu, Z., Taylor, G., Studer, C., & Goldstein, T. (2018). Visualizing the loss landscape of neural nets. Advances in Neural Information Processing Systems, 31. https://proceedings.neurips.cc/paper_files/paper/2018/hash/a41b3bb3e6b050b6c9067c67f663b915-Abstract.html
機械学習-モデル (.bib)
- Abu Alfeilat, H. A., Hassanat, A. B. A., Lasassmeh, O., Tarawneh, A. S., Alhasanat, M. B., Eyal Salman, H. S., & Prasath, V. B. S. (2019). Effects of distance measure choice on K-nearest neighbor classifier performance: a review. Big Data, 7(4), 221–248. https://doi.org/10.1089/big.2018.0175
- Aumüller, M., Bernhardsson, E., & Faithfull, A. (2018). ANN-Benchmarks: A benchmarking tool for approximate nearest neighbor algorithms. https://doi.org/10.48550/arXiv.1807.05614
- Bahdanau, D., Cho, K., & Bengio, Y. (2015). Neural machine translation by jointly learning to align and translate. In Y. Bengio & Y. LeCun (Eds.), Conference Track Proceedings of the 3rd International Conference on Learning Representations. http://arxiv.org/abs/1409.0473
- Belkin, M., Rakhlin, A., & Tsybakov, A. B. (2019). Does data interpolation contradict statistical optimality? The 22nd International Conference on Artificial Intelligence and Statistics, 1611–1619. http://proceedings.mlr.press/v89/belkin19a.html
- Bridle, J. S. (1990). Probabilistic interpretation of feedforward classification network outputs, with relationships to statistical pattern recognition. In F. F. Soulié & J. Hérault (Eds.), Neurocomputing (pp. 227–236). Springer. https://doi.org/10.1007/978-3-642-76153-9_28
- Foret, P., Kleiner, A., Mobahi, H., & Neyshabur, B. (2021). Sharpness-Aware Minimization for Efficiently Improving Generalization. https://doi.org/10.48550/arXiv.2010.01412
- Guo, C., Pleiss, G., Sun, Y., & Weinberger, K. Q. (2017). On calibration of modern neural networks. Proceedings of the 34th International Conference on Machine Learning - Volume 70, 1321–1330.
- Hinton, G. E., Srivastava, N., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. R. (2012). Improving neural networks by preventing co-adaptation of feature detectors. https://doi.org/10.48550/arXiv.1207.0580
- Li, H., Xu, Z., Taylor, G., Studer, C., & Goldstein, T. (2018). Visualizing the loss landscape of neural nets. Advances in Neural Information Processing Systems, 31. https://proceedings.neurips.cc/paper_files/paper/2018/hash/a41b3bb3e6b050b6c9067c67f663b915-Abstract.html
- Malkov, Y. A., & Yashunin, D. A. (2018). Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs. https://doi.org/10.48550/arXiv.1603.09320
- Oikarinen, T., & Weng, T.-W. (2022, September 29). CLIP-Dissect: Automatic Description of Neuron Representations in Deep Vision Networks. https://openreview.net/forum?id=iPWiwWHc1V
- Parry, R. M., Jones, W., Stokes, T. H., Phan, J. H., Moffitt, R. A., Fang, H., Shi, L., Oberthuer, A., Fischer, M., Tong, W., & Wang, M. D. (2010). k-Nearest neighbor models for microarray gene expression analysis and clinical outcome prediction. The Pharmacogenomics Journal, 10(4), 292–309. https://doi.org/10.1038/tpj.2010.56
- Seber, G. A. F., & Lee, A. J. (2003). Linear Regression Analysis (2nd ed.). Wiley.
- Todeschini, R., Ballabio, D., & Consonni, V. (2020). Distances and similarity measures in chemometrics and chemoinformatics. In Encyclopedia of Analytical Chemistry (pp. 1–40). John Wiley & Sons, Ltd. https://doi.org/10.1002/9780470027318.a9438.pub2
- Zhang, C., Bengio, S., Hardt, M., Recht, B., & Vinyals, O. (2017). Understanding deep learning requires rethinking generalization. https://doi.org/10.48550/arXiv.1611.03530
機械学習-不均衡データ (.bib)
- Van Hulse, J., Khoshgoftaar, T. M., & Napolitano, A. (2007). Experimental perspectives on learning from imbalanced data. Proceedings of the 24th International Conference on Machine Learning, 935–942. https://doi.org/10.1145/1273496.1273614
- Wallace, B. C., Small, K., Brodley, C. E., & Trikalinos, T. A. (2011). Class imbalance, redux. IEEE 11th International Conference on Data Mining, 754–763. https://doi.org/10.1109/ICDM.2011.33
機械学習-全般 (.bib)
- Friedman, J., Hastie, T., & Tibshirani, R. (2001). The Elements of Statistical Learning (Vol. 1). Springer Series in Statistics. Springer New York. http://statweb.stanford.edu/~tibs/book/preface.ps
- Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press. http://www.deeplearningbook.org
- James, G., Witten, D., Hastie, T., Tibshirani, R., & Taylor, J. (2023). An Introduction to Statistical Learning, with Applications in Python. Springer.
- Murphy, K. P. (2022). Probabilistic Machine Learning: An Introduction. MIT Press. https://probml.github.io/pml-book/book1.html
- Raschka, S., & Mirjalili, V. (2020). Python機械学習プログラミング:達人データサイエンティストによる理論と実践 (福島 真太朗 & 株式会社クイープ, Trans.; 第3版). インプレス.
- Wolpert, D. H. (1996). The lack of a priori distinctions between learning algorithms. Neural Computation, 8(7), 1341–1390. https://doi.org/10.1162/neco.1996.8.7.1341
- Wolpert, D. H., & Macready, W. G. (1997). No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation, 1(1), 67–82. https://doi.org/10.1109/4235.585893
- Wolpert, D. H. (2002). The supervised learning no-free-lunch theorems. In R. Roy, M. Köppen, S. Ovaska, T. Furuhashi, & F. Hoffmann (Eds.), Soft Computing and Industry: Recent Applications (pp. 25–42). Springer. https://doi.org/10.1007/978-1-4471-0123-9_3
- Zhou, Z.-H. (2021). Machine Learning. Springer Singapore. https://doi.org/10.1007/978-981-15-1967-3
- 周 志华. (2022). 機械学習 (大和田 勇人, 玄 光男, 下川 朝有, & 郝 新厂, Trans.). 近代科学社.
機械学習-基本の枠組み-Bayesian methods-Gaussian processes (.bib)
- Rasmussen, C. E., & Williams, C. K. I. (2005). Gaussian Processes for Machine Learning. The MIT Press.
機械学習-基本の枠組み-Bayesian methods-Stochastic Gradient Langevin Dynamics (.bib)
- Welling, M., & Teh, Y. W. (2011). Bayesian learning via stochastic gradient Langevin dynamics. Proceedings of the 28th International Conference on Machine Learning, 681–688.
機械学習-基本の枠組み-Bayesian methods-Variational inference (.bib)
- Gershman, S. J., & Goodman, N. D. (2014). Amortized inference in probabilistic reasoning. Proceedings of the Annual Meeting of the Cognitive Science Society, 36.
- Kingma, D. P., & Welling, M. (2019). An Introduction to Variational Autoencoders. Foundations and Trends® in Machine Learning, 12(4), 307–392. https://doi.org/10.1561/2200000056
機械学習-基本の枠組み-Bayesian methods-全般 (.bib)
- Edwards, W. (1982). Conservatism in human information processing. In A. Tversky, D. Kahneman, & P. Slovic (Eds.), Judgment under Uncertainty: Heuristics and Biases (pp. 359–369). Cambridge University Press. https://doi.org/10.1017/CBO9780511809477.026
- 繁桝 算男. (1985). ベイズ統計入門. 東京大学出版会.
機械学習-基本の枠組み-Bayesian methods (.bib)
- Edwards, W. (1982). Conservatism in human information processing. In A. Tversky, D. Kahneman, & P. Slovic (Eds.), Judgment under Uncertainty: Heuristics and Biases (pp. 359–369). Cambridge University Press. https://doi.org/10.1017/CBO9780511809477.026
- 繁桝 算男. (1985). ベイズ統計入門. 東京大学出版会.
- Gershman, S. J., & Goodman, N. D. (2014). Amortized inference in probabilistic reasoning. Proceedings of the Annual Meeting of the Cognitive Science Society, 36.
- Kingma, D. P., & Welling, M. (2019). An Introduction to Variational Autoencoders. Foundations and Trends® in Machine Learning, 12(4), 307–392. https://doi.org/10.1561/2200000056
- Rasmussen, C. E., & Williams, C. K. I. (2005). Gaussian Processes for Machine Learning. The MIT Press.
- Welling, M., & Teh, Y. W. (2011). Bayesian learning via stochastic gradient Langevin dynamics. Proceedings of the 28th International Conference on Machine Learning, 681–688.
機械学習-基本の枠組み-Hyperparameter tuning (.bib)
- Bergstra, J., & Bengio, Y. (2012). Random search for hyper-parameter optimization. Journal of Machine Learning Research, 13, 281–305.
機械学習-基本の枠組み-不均衡データ (.bib)
- Van Hulse, J., Khoshgoftaar, T. M., & Napolitano, A. (2007). Experimental perspectives on learning from imbalanced data. Proceedings of the 24th International Conference on Machine Learning, 935–942. https://doi.org/10.1145/1273496.1273614
- Wallace, B. C., Small, K., Brodley, C. E., & Trikalinos, T. A. (2011). Class imbalance, redux. IEEE 11th International Conference on Data Mining, 754–763. https://doi.org/10.1109/ICDM.2011.33
機械学習-基本の枠組み-事前学習 (.bib)
- Chen, T., Kornblith, S., Norouzi, M., & Hinton, G. (2020). A simple framework for contrastive learning of visual representations. Proceedings of the 37th International Conference on Machine Learning, 1597–1607. https://proceedings.mlr.press/v119/chen20j.html
機械学習-基本の枠組み-全般 (.bib)
- Friedman, J., Hastie, T., & Tibshirani, R. (2001). The Elements of Statistical Learning (Vol. 1). Springer Series in Statistics. Springer New York. http://statweb.stanford.edu/~tibs/book/preface.ps
- James, G., Witten, D., Hastie, T., Tibshirani, R., & Taylor, J. (2023). An Introduction to Statistical Learning, with Applications in Python. Springer.
- Murphy, K. P. (2022). Probabilistic Machine Learning: An Introduction. MIT Press. https://probml.github.io/pml-book/book1.html
- Raschka, S., & Mirjalili, V. (2020). Python機械学習プログラミング:達人データサイエンティストによる理論と実践 (福島 真太朗 & 株式会社クイープ, Trans.; 第3版). インプレス.
- Wolpert, D. H. (1996). The lack of a priori distinctions between learning algorithms. Neural Computation, 8(7), 1341–1390. https://doi.org/10.1162/neco.1996.8.7.1341
- Wolpert, D. H., & Macready, W. G. (1997). No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation, 1(1), 67–82. https://doi.org/10.1109/4235.585893
- Wolpert, D. H. (2002). The supervised learning no-free-lunch theorems. In R. Roy, M. Köppen, S. Ovaska, T. Furuhashi, & F. Hoffmann (Eds.), Soft Computing and Industry: Recent Applications (pp. 25–42). Springer. https://doi.org/10.1007/978-1-4471-0123-9_3
- Zhou, Z.-H. (2021). Machine Learning. Springer Singapore. https://doi.org/10.1007/978-981-15-1967-3
- 周 志华. (2022). 機械学習 (大和田 勇人, 玄 光男, 下川 朝有, & 郝 新厂, Trans.). 近代科学社.
機械学習-基本の枠組み-共通理解 (.bib)
- Goldblum, M., Finzi, M., Rowan, K., & Wilson, A. G. (2024). Position: the no free lunch theorem, Kolmogorov complexity, and the role of inductive biases in machine learning. Proceedings of the 41st International Conference on Machine Learning (Position Paper Track), 235, 15788–15808.
- No Free Lunch Theorems. Retrieved February 15, 2025, from http://www.no-free-lunch.org/
- Wolpert, D. H. (1996). The lack of a priori distinctions between learning algorithms. Neural Computation, 8(7), 1341–1390. https://doi.org/10.1162/neco.1996.8.7.1341
- Wolpert, D. H., & Macready, W. G. (1997). No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation, 1(1), 67–82. https://doi.org/10.1109/4235.585893
- Wolpert, D. H. (2002). The supervised learning no-free-lunch theorems. In R. Roy, M. Köppen, S. Ovaska, T. Furuhashi, & F. Hoffmann (Eds.), Soft Computing and Industry: Recent Applications (pp. 25–42). Springer. https://doi.org/10.1007/978-1-4471-0123-9_3
機械学習-基本の枠組み-損失関数 (.bib)
- Gneiting, T. (2011). Making and evaluating point forecasts. Journal of the American Statistical Association, 106(494), 746–762. https://doi.org/10.1198/jasa.2011.r10138
- Koenker, R. (2005). Quantile Regression (Number 38). Cambridge University Press.
- Koenker, R., & Bassett, G., Jr. (1978). Regression quantiles. Econometrica, 46(1), 33–50. https://doi.org/10.2307/1913643
- Mukhoti, J., Kulharia, V., Sanyal, A., Golodetz, S., Torr, P. H. S., & Dokania, P. K. (2020). Calibrating deep neural networks using focal loss. Advances in Neural Information Processing Systems, 33, 15288–15299.
- Osband, K. H. (1985). Providing Incentives for Better Cost Forecasting [PhD thesis]. University of California, Berkeley.
- Steinwart, I., Pasin, C., Williamson, R., & Zhang, S. (2014). Elicitation and identification of properties. Proceedings of The 27th Conference on Learning Theory, 482–526. https://proceedings.mlr.press/v35/steinwart14.html
機械学習-基本の枠組み-統計的学習理論 (.bib)
- Bach, F. (2024). Learning Theory from First Principles. The MIT Press.
- Cover, T., & Hart, P. (1967). Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 13(1), 21–27. https://doi.org/10.1109/TIT.1967.1053964
- 金森 敬文. (2015). 統計的学習理論. 講談社.
- Mohri, M., Rostamizadeh, A., & Talwalkar, A. (2018). Foundations of Machine Learning (2nd ed.). The MIT Press.
- Shalev-Shwartz, S., & Ben-David, S. (2014). Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press. https://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning/copy.html
- Tsybakov, A. B. (2009). Introduction to Nonparametric Estimation. Springer.
- Vapnik, V. N. (2000). The Nature of Statistical Learning Theory. Springer New York. https://doi.org/10.1007/978-1-4757-3264-1
機械学習-基本の枠組み-表現学習 (.bib)
- Geirhos, R., Rubisch, P., Michaelis, C., Bethge, M., Wichmann, F. A., & Brendel, W. (2018, September 27). ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness. https://openreview.net/forum?id=Bygh9j09KX
- Geirhos, R., Jacobsen, J.-H., Michaelis, C., Zemel, R., Brendel, W., Bethge, M., & Wichmann, F. A. (2020). Shortcut learning in deep neural networks. Nature Machine Intelligence, 2(11), 665–673. https://doi.org/10.1038/s42256-020-00257-z
- Hermann, K., Mobahi, H., Fel, T., & Mozer, M. C. (2023, October 13). On the foundations of shortcut learning. https://openreview.net/forum?id=Tj3xLVuE9f
機械学習-基本の枠組み-評価指標 (.bib)
- Brodersen, K. H., Ong, C. S., Stephan, K. E., & Buhmann, J. M. (2010). The balanced accuracy and its posterior distribution. 2010 20th International Conference on Pattern Recognition, 3121–3124. https://doi.org/10.1109/ICPR.2010.764
機械学習-基本の枠組み-類似度学習 (.bib)
- Jaiswal, A., Babu, A. R., Zadeh, M. Z., Banerjee, D., & Makedon, F. (2021). A Survey on Contrastive Self-Supervised Learning. Technologies, 9(1), 2. https://doi.org/10.3390/technologies9010002
- Weinberger, K. Q., & Saul, L. K. (2009). Distance metric learning for large margin nearest neighbor classification. Journal of Machine Learning Research, 10(Feb), 207–244. http://www.jmlr.org/papers/v10/weinberger09a.html
機械学習-基本の枠組み (.bib)
- Bach, F. (2024). Learning Theory from First Principles. The MIT Press.
- Bergstra, J., & Bengio, Y. (2012). Random search for hyper-parameter optimization. Journal of Machine Learning Research, 13, 281–305.
- Brodersen, K. H., Ong, C. S., Stephan, K. E., & Buhmann, J. M. (2010). The balanced accuracy and its posterior distribution. 2010 20th International Conference on Pattern Recognition, 3121–3124. https://doi.org/10.1109/ICPR.2010.764
- Chen, T., Kornblith, S., Norouzi, M., & Hinton, G. (2020). A simple framework for contrastive learning of visual representations. Proceedings of the 37th International Conference on Machine Learning, 1597–1607. https://proceedings.mlr.press/v119/chen20j.html
- Cover, T., & Hart, P. (1967). Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 13(1), 21–27. https://doi.org/10.1109/TIT.1967.1053964
- Edwards, W. (1982). Conservatism in human information processing. In A. Tversky, D. Kahneman, & P. Slovic (Eds.), Judgment under Uncertainty: Heuristics and Biases (pp. 359–369). Cambridge University Press. https://doi.org/10.1017/CBO9780511809477.026
- 繁桝 算男. (1985). ベイズ統計入門. 東京大学出版会.
- Geirhos, R., Rubisch, P., Michaelis, C., Bethge, M., Wichmann, F. A., & Brendel, W. (2018, September 27). ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness. https://openreview.net/forum?id=Bygh9j09KX
- Geirhos, R., Jacobsen, J.-H., Michaelis, C., Zemel, R., Brendel, W., Bethge, M., & Wichmann, F. A. (2020). Shortcut learning in deep neural networks. Nature Machine Intelligence, 2(11), 665–673. https://doi.org/10.1038/s42256-020-00257-z
- Gershman, S. J., & Goodman, N. D. (2014). Amortized inference in probabilistic reasoning. Proceedings of the Annual Meeting of the Cognitive Science Society, 36.
- Gneiting, T. (2011). Making and evaluating point forecasts. Journal of the American Statistical Association, 106(494), 746–762. https://doi.org/10.1198/jasa.2011.r10138
- Goldblum, M., Finzi, M., Rowan, K., & Wilson, A. G. (2024). Position: the no free lunch theorem, Kolmogorov complexity, and the role of inductive biases in machine learning. Proceedings of the 41st International Conference on Machine Learning (Position Paper Track), 235, 15788–15808.
- Hermann, K., Mobahi, H., Fel, T., & Mozer, M. C. (2023, October 13). On the foundations of shortcut learning. https://openreview.net/forum?id=Tj3xLVuE9f
- Jaiswal, A., Babu, A. R., Zadeh, M. Z., Banerjee, D., & Makedon, F. (2021). A Survey on Contrastive Self-Supervised Learning. Technologies, 9(1), 2. https://doi.org/10.3390/technologies9010002
- 金森 敬文. (2015). 統計的学習理論. 講談社.
- Kingma, D. P., & Welling, M. (2019). An Introduction to Variational Autoencoders. Foundations and Trends® in Machine Learning, 12(4), 307–392. https://doi.org/10.1561/2200000056
- Koenker, R. (2005). Quantile Regression (Number 38). Cambridge University Press.
- Koenker, R., & Bassett, G., Jr. (1978). Regression quantiles. Econometrica, 46(1), 33–50. https://doi.org/10.2307/1913643
- Mohri, M., Rostamizadeh, A., & Talwalkar, A. (2018). Foundations of Machine Learning (2nd ed.). The MIT Press.
- Mukhoti, J., Kulharia, V., Sanyal, A., Golodetz, S., Torr, P. H. S., & Dokania, P. K. (2020). Calibrating deep neural networks using focal loss. Advances in Neural Information Processing Systems, 33, 15288–15299.
- No Free Lunch Theorems. Retrieved February 15, 2025, from http://www.no-free-lunch.org/
- Osband, K. H. (1985). Providing Incentives for Better Cost Forecasting [Phdthesis]. University of California, Berkeley.
- Rasmussen, C. E., & Williams, C. K. I. (2005). Gaussian Processes for Machine Learning. The MIT Press.
- Shalev-Shwartz, S., & Ben-David, S. (2014). Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press. https://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning/copy.html
- Steinwart, I., Pasin, C., Williamson, R., & Zhang, S. (2014). Elicitation and identification of properties. Proceedings of The 27th Conference on Learning Theory, 482–526. https://proceedings.mlr.press/v35/steinwart14.html
- Tsybakov, A. B. (2009). Introduction to Nonparametric Estimation. Springer.
- Van Hulse, J., Khoshgoftaar, T. M., & Napolitano, A. (2007). Experimental perspectives on learning from imbalanced data. Proceedings of the 24th International Conference on Machine Learning, 935–942. https://doi.org/10.1145/1273496.1273614
- Vapnik, V. N. (2000). The Nature of Statistical Learning Theory. Springer New York. https://doi.org/10.1007/978-1-4757-3264-1
- Wallace, B. C., Small, K., Brodley, C. E., & Trikalinos, T. A. (2011). Class imbalance, redux. IEEE 11th International Conference on Data Mining, 754–763. https://doi.org/10.1109/ICDM.2011.33
- Weinberger, K. Q., & Saul, L. K. (2009). Distance metric learning for large margin nearest neighbor classification. Journal of Machine Learning Research, 10(Feb), 207–244. http://www.jmlr.org/papers/v10/weinberger09a.html
- Welling, M., & Teh, Y. W. (2011). Bayesian learning via stochastic gradient Langevin dynamics. Proceedings of the 28th International Conference on Machine Learning, 681–688.
- Wolpert, D. H. (1996). The lack of a priori distinctions between learning algorithms. Neural Computation, 8(7), 1341–1390. https://doi.org/10.1162/neco.1996.8.7.1341
- Wolpert, D. H., & Macready, W. G. (1997). No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation, 1(1), 67–82. https://doi.org/10.1109/4235.585893
- Wolpert, D. H. (2002). The supervised learning no-free-lunch theorems. In R. Roy, M. Köppen, S. Ovaska, T. Furuhashi, & F. Hoffmann (Eds.), Soft Computing and Industry: Recent Applications (pp. 25–42). Springer. https://doi.org/10.1007/978-1-4471-0123-9_3
機械学習-損失関数 (.bib)
- Koenker, R. (2005). Quantile Regression (Number 38). Cambridge University Press.
- Koenker, R., & Bassett, G., Jr. (1978). Regression quantiles. Econometrica, 46(1), 33–50. https://doi.org/10.2307/1913643
機械学習-統計的学習理論 (.bib)
- Bach, F. (2024). Learning Theory from First Principles. The MIT Press.
- 金森 敬文. (2015). 統計的学習理論. 講談社.
- Mohri, M., Rostamizadeh, A., & Talwalkar, A. (2018). Foundations of Machine Learning (2nd ed.). The MIT Press.
- Shalev-Shwartz, S., & Ben-David, S. (2014). Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press. https://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning/copy.html
- Vapnik, V. N. (2000). The Nature of Statistical Learning Theory. Springer New York. https://doi.org/10.1007/978-1-4757-3264-1
機械学習-解釈性 (.bib)
- Molnar, C. Interpretable Machine Learning. Retrieved February 23, 2025, from https://christophm.github.io/interpretable-ml-book/
- Molnar, C. Interpretable Machine Learning(邦訳). Retrieved February 23, 2025, from https://hacarus.github.io/interpretable-ml-book-ja/index.html
機械学習-評価指標 (.bib)
機械学習 (.bib)
- 1,000億パラメータの独自LLM「PLaMo-100B」の事後学習が完了. (2024). Preferred Networks Research & Development. https://tech.preferred.jp/ja/blog/plamo-100b-post-training/
- 7群7編 分散協調とエージェント – 電子情報通信学会知識ベース. Retrieved July 5, 2025, from https://www.ieice-hbkb.org/portal/7%e7%be%a4%e3%80%80%e3%82%b3%e3%83%b3%e3%83%94%e3%83%a5%e3%83%bc%e3%82%bf-%ef%bd%bf%ef%be%8c%ef%be%84%ef%bd%b3%ef%bd%aa%ef%bd%b1/7%e7%be%a47%e7%b7%a8-%e5%88%86%e6%95%a3%e5%8d%94%e8%aa%bf%e3%81%a8%e3%82%a8%e3%83%bc%e3%82%b8%e3%82%a7%e3%83%b3%e3%83%88/
- Abu Alfeilat, H. A., Hassanat, A. B. A., Lasassmeh, O., Tarawneh, A. S., Alhasanat, M. B., Eyal Salman, H. S., & Prasath, V. B. S. (2019). Effects of distance measure choice on K-nearest neighbor classifier performance: a review. Big Data, 7(4), 221–248. https://doi.org/10.1089/big.2018.0175
- Ahmad, A., & Khan, S. S. (2019). Survey of state-of-the-art mixed data clustering algorithms. IEEE Access, 7, 31883–31902. https://doi.org/10.1109/ACCESS.2019.2903568
- Ali, M., Fromm, M., Thellmann, K., Rutmann, R., Lübbering, M., Leveling, J., Klug, K., Ebert, J., Doll, N., Buschhoff, J., Jain, C., Weber, A., Jurkschat, L., Abdelwahab, H., John, C., Ortiz Suarez, P., Ostendorff, M., Weinbach, S., Sifa, R., … Flores-Herr, N. (2024). Tokenizer choice for LLM training: negligible or crucial? In K. Duh, H. Gomez, & S. Bethard (Eds.), Findings of the Association for Computational Linguistics: NAACL 2024 (pp. 3907–3924). Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.findings-naacl.247
- Angelopoulos, A. N., & Bates, S. (2022). A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification. http://arxiv.org/abs/2107.07511
- Arnett, C., Chang, T. A., & Bergen, B. (2024). A Bit of a Problem: Measurement Disparities in Dataset Sizes across Languages. In M. Melero, S. Sakti, & C. Soria (Eds.), Proceedings of the 3rd Annual Meeting of the Special Interest Group on Under-resourced Languages @ LREC-COLING 2024 (pp. 1–9). ELRA and ICCL. https://aclanthology.org/2024.sigul-1.1/
- Arnett, C., & Bergen, B. (2025). Why do language models perform worse for morphologically complex languages? In O. Rambow, L. Wanner, M. Apidianaki, H. Al-Khalifa, B. D. Eugenio, & S. Schockaert (Eds.), Proceedings of the 31st International Conference on Computational Linguistics (pp. 6607–6623). Association for Computational Linguistics. https://aclanthology.org/2025.coling-main.441/
- Arora, S., Liang, Y., & Ma, T. (2017, February 6). A simple but tough-to-beat baseline for sentence embeddings. https://openreview.net/forum?id=SyK00v5xx
- Arthur, D., & Vassilvitskii, S. (2007). k-means++: the advantages of careful seeding. Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, 1027–1035.
- Askell, A., Bai, Y., Chen, A., Drain, D., Ganguli, D., Henighan, T., Jones, A., Joseph, N., Mann, B., DasSarma, N., Elhage, N., Hatfield-Dodds, Z., Hernandez, D., Kernion, J., Ndousse, K., Olsson, C., Amodei, D., Brown, T., Clark, J., … Kaplan, J. (2021). A general language assistant as a laboratory for alignment. https://doi.org/10.48550/arXiv.2112.00861
- Aumüller, M., Bernhardsson, E., & Faithfull, A. (2018). ANN-Benchmarks: A benchmarking tool for approximate nearest neighbor algorithms. https://doi.org/10.48550/arXiv.1807.05614
- Azevedo, F. A. C., Carvalho, L. R. B., Grinberg, L. T., Farfel, J. M., Ferretti, R. E. L., Leite, R. E. P., Filho, W. J., Lent, R., & Herculano-Houzel, S. (2009). Equal numbers of neuronal and nonneuronal cells make the human brain an isometrically scaled-up primate brain. Journal of Comparative Neurology, 513(5), 532–541. https://doi.org/10.1002/cne.21974
- Bach, F. (2024). Learning Theory from First Principles. The MIT Press.
- Bahdanau, D., Cho, K., & Bengio, Y. (2015). Neural machine translation by jointly learning to align and translate. In Y. Bengio & Y. LeCun (Eds.), Conference Track Proceedings of the 3rd International Conference on Learning Representations. http://arxiv.org/abs/1409.0473
- Bai, Y., Kadavath, S., Kundu, S., Askell, A., Kernion, J., Jones, A., Chen, A., Goldie, A., Mirhoseini, A., McKinnon, C., Chen, C., Olsson, C., Olah, C., Hernandez, D., Drain, D., Ganguli, D., Li, D., Tran-Johnson, E., Perez, E., … Kaplan, J. (2022). Constitutional AI: harmlessness from AI feedback. https://doi.org/10.48550/arXiv.2212.08073
- Belkin, M., Rakhlin, A., & Tsybakov, A. B. (2019). Does data interpolation contradict statistical optimality? The 22nd International Conference on Artificial Intelligence and Statistics, 1611–1619. http://proceedings.mlr.press/v89/belkin19a.html
- Be My Eyes (Ed.). (2024). Be My Eyes Accessibility with GPT-4o. https://www.youtube.com/watch?v=Zq710AKC1gg
- Bergstra, J., & Bengio, Y. (2012). Random search for hyper-parameter optimization. Journal of Machine Learning Research, 13, 281–305.
- Bouman, R., Bukhsh, Z., & Heskes, T. (2024). Unsupervised anomaly detection algorithms on real-world data: how many do we need? Journal of Machine Learning Research, 25, 1–34.
- Bridle, J. S. (1990). Probabilistic interpretation of feedforward classification network outputs, with relationships to statistical pattern recognition. In F. F. Soulié & J. Hérault (Eds.), Neurocomputing (pp. 227–236). Springer. https://doi.org/10.1007/978-3-642-76153-9_28
- Brodersen, K. H., Ong, C. S., Stephan, K. E., & Buhmann, J. M. (2010). The balanced accuracy and its posterior distribution. 2010 20th International Conference on Pattern Recognition, 3121–3124. https://doi.org/10.1109/ICPR.2010.764
- Bubeck, S., Chandrasekaran, V., Eldan, R., Gehrke, J., Horvitz, E., Kamar, E., Lee, P., Lee, Y. T., Li, Y., Lundberg, S., Nori, H., Palangi, H., Ribeiro, M. T., & Zhang, Y. (2023). Sparks of Artificial General Intelligence: Early experiments with GPT-4. https://doi.org/10.48550/arXiv.2303.12712
- Fraley, C., & Raftery, A. E. (2000). Model-based clustering, discriminant analysis, and density estimation (Technical Report No. 380). Department of Statistics, University of Washington. https://csss.uw.edu/files/working-papers/2000/wp11.pdf
- Campbell, N. D. F., & Kautz, J. (2014). Learning a manifold of fonts. ACM Transactions on Graphics, 33(4), 1–11. https://doi.org/10.1145/2601097.2601212
- Cao, F., Liang, J., & Bai, L. (2009). A new initialization method for categorical data clustering. Expert Systems with Applications, 36(7), 10223–10228. https://doi.org/10.1016/j.eswa.2009.01.060
- Carlini, N., Ippolito, D., Jagielski, M., Lee, K., Tramer, F., & Zhang, C. (2022, September 29). Quantifying memorization across neural language models. https://openreview.net/forum?id=TatRHT_1cK
- Chan, C., Ginosar, S., Zhou, T., & Efros, A. (2019). Everybody dance now. 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 5932–5941. https://doi.org/10.1109/ICCV.2019.00603
- Chen, S., Wong, S., Chen, L., & Tian, Y. (2023). Extending context window of large language models via positional interpolation. https://doi.org/10.48550/arXiv.2306.15595
- Chen, T., Kornblith, S., Norouzi, M., & Hinton, G. (2020). A simple framework for contrastive learning of visual representations. Proceedings of the 37th International Conference on Machine Learning, 1597–1607. https://proceedings.mlr.press/v119/chen20j.html
- Clauset, A., Newman, M. E. J., & Moore, C. (2004). Finding community structure in very large networks. Physical Review E, 70(6), 066111. https://doi.org/10.1103/PhysRevE.70.066111
- Ezugwu, A. E., Ikotun, A. M., Oyelade, O. O., Abualigah, L., Agushaka, J. O., Eke, C. I., & Akinyelu, A. A. (2022). A comprehensive survey of clustering algorithms: State-of-the-art machine learning applications, taxonomy, challenges, and future research prospects. Engineering Applications of Artificial Intelligence, 110, 104743. https://doi.org/10.1016/j.engappai.2022.104743
- Cover, T., & Hart, P. (1967). Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 13(1), 21–27. https://doi.org/10.1109/TIT.1967.1053964
- Dathathri, S., See, A., Ghaisas, S., Huang, P.-S., McAdam, R., Welbl, J., Bachani, V., Kaskasoli, A., Stanforth, R., Matejovicova, T., Hayes, J., Vyas, N., Merey, M. A., Brown-Cohen, J., Bunel, R., Balle, B., Cemgil, T., Ahmed, Z., Stacpoole, K., … Kohli, P. (2024). Scalable watermarking for identifying large language model outputs. Nature, 634(8035), 818–823. https://doi.org/10.1038/s41586-024-08025-4
- DeepSeek-AI, Guo, D., Yang, D., Zhang, H., Song, J., Zhang, R., Xu, R., Zhu, Q., Ma, S., Wang, P., Bi, X., Zhang, X., Yu, X., Wu, Y., Wu, Z. F., Gou, Z., Shao, Z., Li, Z., Gao, Z., … Zhang, Z. (2025). DeepSeek-R1: incentivizing reasoning capability in LLMs via reinforcement learning. https://doi.org/10.48550/arXiv.2501.12948
- Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., & Fei-Fei, L. (2009). ImageNet: A large-scale hierarchical image database. 2009 IEEE Conference on Computer Vision and Pattern Recognition, 248–255. https://doi.org/10.1109/CVPR.2009.5206848
- Dettmers, T., Pagnoni, A., Holtzman, A., & Zettlemoyer, L. (2023, November 2). QLoRA: Efficient Finetuning of Quantized LLMs. https://openreview.net/forum?id=OUIFPHEgJU
- 電子情報通信学会. (2019). 7群-7編-1章 エージェントの定義・モデル・概念. In 電子情報通信学会知識ベース (ver.1 ed., Number 7群7編). https://www.ieice-hbkb.org/files/ad_base/view_pdf.html?p=/files/07/07gun_07hen_01.pdf
- Distributional Hypothesis - ACL Wiki. Retrieved June 22, 2025, from https://www.aclweb.org/aclwiki/index.php?title=Distributional_Hypothesis
- Dong, H., Xiong, W., Pang, B., Wang, H., Zhao, H., Zhou, Y., Jiang, N., Sahoo, D., Xiong, C., & Zhang, T. (2024). RLHF Workflow: From Reward Modeling to Online RLHF. Transactions on Machine Learning Research. https://openreview.net/forum?id=a13aYUU9eU
- drhead. (2024). The VAE used for Stable Diffusion 1.x/2.x and other models (KL-F8) has a critical flaw, probably due to bad training, that is holding back all models that use it (almost certainly including DALL-E 3). [Reddit Post]. r/StableDiffusion. https://www.reddit.com/r/StableDiffusion/comments/1ag5h5s/the_vae_used_for_stable_diffusion_1x2x_and_other/
- Dubey, A., et al. (2024). The Llama 3 herd of models. https://doi.org/10.48550/arXiv.2407.21783
- Dutta, D., Ansari, F., Chakrabarty, A., & Das, S. (2025). On the existence of universal simulators of attention. https://doi.org/10.48550/arXiv.2506.18739
- Edwards, W. (1982). Conservatism in human information processing. In A. Tversky, D. Kahneman, & P. Slovic (Eds.), Judgment under Uncertainty: Heuristics and Biases (pp. 359–369). Cambridge University Press. https://doi.org/10.1017/CBO9780511809477.026
- Elad, M. (2010). Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing. Springer. https://doi.org/10.1007/978-1-4419-7011-4
- Eloundou, T., Manning, S., Mishkin, P., & Rock, D. (2023). GPTs are GPTs: an early look at the labor market impact potential of large language models. https://doi.org/10.48550/arXiv.2303.10130
- Eloundou, T., Manning, S., Mishkin, P., & Rock, D. (2024). GPTs are GPTs: Labor market impact potential of LLMs. Science, 384(6702), 1306–1308. https://doi.org/10.1126/science.adj0998
- Emsley, R. (2023). ChatGPT: these are not hallucinations – they’re fabrications and falsifications. Schizophrenia, 9(1), 1–2. https://doi.org/10.1038/s41537-023-00379-4
- Entezari, R., Wortsman, M., Saukh, O., Shariatnia, M. M., Sedghi, H., & Schmidt, L. (2023). The role of pre-training data in transfer learning. https://doi.org/10.48550/arXiv.2302.13602
- Ester, M., Kriegel, H.-P., Sander, J., & Xu, X. (1996). A density-based algorithm for discovering clusters in large spatial databases with noise. Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining, 226–231.
- 繁桝 算男. (1985). ベイズ統計入門. 東京大学出版会.
- Fedus, W., Zoph, B., & Shazeer, N. (2022). Switch transformers: scaling to trillion parameter models with simple and efficient sparsity. Journal of Machine Learning Research, 23(1), Article 120, 5232–5270.
- Foret, P., Kleiner, A., Mobahi, H., & Neyshabur, B. (2021). Sharpness-Aware Minimization for Efficiently Improving Generalization. https://doi.org/10.48550/arXiv.2010.01412
- Fraley, C., & Raftery, A. E. (1998). How Many Clusters? Which Clustering Method? Answers Via Model-Based Cluster Analysis. The Computer Journal, 41(8), 578–588. https://doi.org/10.1093/comjnl/41.8.578
- Friedman, J., Hastie, T., & Tibshirani, R. (2001). The Elements of Statistical Learning. Springer Series in Statistics. Springer. http://statweb.stanford.edu/~tibs/book/preface.ps
- 冨岡 亮太. (2015). スパース性に基づく機械学習 = Machine learning with sparsity inducing regularizations (MLP機械学習プロフェッショナルシリーズ) (講談社サイエンティフィク, Ed.). 講談社. https://elib.maruzen.co.jp/elib/html/BookDetail/Id/3000080942/
- Fu, L., Yang, B., Kuang, Z., Song, J., Li, Y., Zhu, L., Luo, Q., Wang, X., Lu, H., Huang, M., Li, Z., Tang, G., Shan, B., Lin, C., Liu, Q., Wu, B., Feng, H., Liu, H., Huang, C., … Bai, X. (2024). OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning (Version 1). https://doi.org/10.48550/arXiv.2501.00321
- Geirhos, R., Rubisch, P., Michaelis, C., Bethge, M., Wichmann, F. A., & Brendel, W. (2018, September 27). ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness. https://openreview.net/forum?id=Bygh9j09KX
- Geirhos, R., Jacobsen, J.-H., Michaelis, C., Zemel, R., Brendel, W., Bethge, M., & Wichmann, F. A. (2020). Shortcut learning in deep neural networks. Nature Machine Intelligence, 2(11), 665–673. https://doi.org/10.1038/s42256-020-00257-z
- Gemini Team, Google. (2025). Gemini 2.5: pushing the frontier with advanced reasoning, multimodality, long context, and next generation agentic capabilities. https://storage.googleapis.com/deepmind-media/gemini/gemini_v2_5_report.pdf
- Ge, D., Jiang, X., & Ye, Y. (2011). A note on the complexity of Lp minimization. Mathematical Programming, 129(2), 285–299. https://doi.org/10.1007/s10107-011-0470-2
- Gershman, S. J., & Goodman, N. D. (2014). Amortized inference in probabilistic reasoning. Proceedings of the 36th Annual Meeting of the Cognitive Science Society.
- Girvan, M., & Newman, M. E. J. (2002). Community structure in social and biological networks. Proceedings of the National Academy of Sciences, 99(12), 7821–7826. https://doi.org/10.1073/pnas.122653799
- Gneiting, T. (2011). Making and evaluating point forecasts. Journal of the American Statistical Association, 106(494), 746–762. https://doi.org/10.1198/jasa.2011.r10138
- Goldblum, M., Finzi, M., Rowan, K., & Wilson, A. G. (2024). Position: the no free lunch theorem, Kolmogorov complexity, and the role of inductive biases in machine learning. Proceedings of the 41st International Conference on Machine Learning (Position Paper Track), 235, 15788–15808.
- Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press. http://www.deeplearningbook.org
- Gottesman, Y. (2023). Understand diffusion models with VAEs. Yoni Gottesman. https://yonigottesman.github.io/2023/03/11/vae.html
- Guan, M. Y., Joglekar, M., Wallace, E., Jain, S., Barak, B., Helyar, A., Dias, R., Vallone, A., Ren, H., Wei, J., Chung, H. W., Toyer, S., Heidecke, J., Beutel, A., & Glaese, A. (2025). Deliberative alignment: reasoning enables safer language models. https://doi.org/10.48550/arXiv.2412.16339
- Guo, C., Pleiss, G., Sun, Y., & Weinberger, K. Q. (2017). On calibration of modern neural networks. Proceedings of the 34th International Conference on Machine Learning - Volume 70, 1321–1330.
- Hariri, S., Kind, M. C., & Brunner, R. J. (2021). Extended isolation forest. IEEE Transactions on Knowledge and Data Engineering, 33(4), 1479–1489. https://doi.org/10.1109/TKDE.2019.2947676
- Hayou, S., Ghosh, N., & Yu, B. (2024). LoRA+: Efficient Low Rank Adaptation of Large Models. Proceedings of the 41st International Conference on Machine Learning, 17783–17806. https://proceedings.mlr.press/v235/hayou24a.html
- 横井 祥. (2024). 「確率的なオウム」にできること、またそれがなぜできるのかについて. https://speakerdeck.com/eumesy/language-models-as-modern-version-of-the-use-theory-of-meaning
- Henighan, T., Kaplan, J., Katz, M., Chen, M., Hesse, C., Jackson, J., Jun, H., Brown, T. B., Dhariwal, P., Gray, S., Hallacy, C., Mann, B., Radford, A., Ramesh, A., Ryder, N., Ziegler, D. M., Schulman, J., Amodei, D., & McCandlish, S. (2020). Scaling laws for autoregressive generative modeling. https://doi.org/10.48550/arXiv.2010.14701
- Hermann, K., Mobahi, H., Fel, T., & Mozer, M. C. (2023, October 13). On the foundations of shortcut learning. https://openreview.net/forum?id=Tj3xLVuE9f
- Hernandez, D., Kaplan, J., Henighan, T., & McCandlish, S. (2021). Scaling laws for transfer. https://doi.org/10.48550/arXiv.2102.01293
- Hinton, G. E., Srivastava, N., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. R. (2012). Improving neural networks by preventing co-adaptation of feature detectors. https://doi.org/10.48550/arXiv.1207.0580
- Ho, J., Jain, A., & Abbeel, P. (2020). Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems, 33, 6840–6851.
- Hoffmann, J., Borgeaud, S., Mensch, A., … Sifre, L. (2022, October 31). An empirical analysis of compute-optimal large language model training. https://openreview.net/forum?id=iBBcRUlOAPR
- Hoffmann, J., Borgeaud, S., Mensch, A., … Sifre, L. (2022). Training compute-optimal large language models. https://doi.org/10.48550/arXiv.2203.15556
- Holtzman, A., Buys, J., Du, L., Forbes, M., & Choi, Y. (2019, September 25). The curious case of neural text degeneration. https://openreview.net/forum?id=rygGQyrFvH
- Hu, E. J., Shen, Y., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, S., Wang, L., & Chen, W. (2022). LoRA: Low-rank adaptation of large language models. International Conference on Learning Representations. https://openreview.net/forum?id=nZeVKeeFYf9
- Huang, Z. (1998). Extensions to the k-means algorithm for clustering large data sets with categorical values. Data Mining and Knowledge Discovery, 2(3), 283–304. https://doi.org/10.1023/A:1009769707641
- Hutchins, J. (1995). "The whisky was invisible", or Persistent myths of MT. MT News International, 11. https://web.archive.org/web/20210103041306/http://www.hutchinsweb.me.uk/MTNI-11-1995.pdf
- Jaiswal, A., Babu, A. R., Zadeh, M. Z., Banerjee, D., & Makedon, F. (2021). A Survey on Contrastive Self-Supervised Learning. Technologies, 9(1), 2. https://doi.org/10.3390/technologies9010002
- James, G., Witten, D., Hastie, T., Tibshirani, R., & Taylor, J. (2023). An Introduction to Statistical Learning, with Applications in Python. Springer.
- Jin, M., Yu, Q., Shu, D., Zhao, H., Hua, W., Meng, Y., Zhang, Y., & Du, M. (2024). The impact of reasoning step length on large language models (L.-W. Ku, A. Martins, & V. Srikumar, Eds.; pp. 1830–1842). Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.findings-acl.108
- 金森 敬文. (2015). 統計的学習理論. 講談社.
- Jocher, G., Qiu, J., & Chaurasia, A. (2023). Ultralytics YOLO (Version 8.0.0). https://github.com/ultralytics/ultralytics
- Schulman, J. (2023). Reinforcement Learning from Human Feedback: Progress and Challenges. https://www.youtube.com/watch?v=hhiLw5Q_UFg
- Jolliffe, I. T. (2010). Principal Component Analysis (2nd ed.). Springer.
- Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., Tunyasuvunakool, K., Bates, R., Žídek, A., Potapenko, A., Bridgland, A., Meyer, C., Kohl, S. A. A., Ballard, A. J., Cowie, A., Romera-Paredes, B., Nikolov, S., Jain, R., Adler, J., … Hassabis, D. (2021). Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873), 583–589. https://doi.org/10.1038/s41586-021-03819-2
- Kallini, J., Papadimitriou, I., Futrell, R., Mahowald, K., & Potts, C. (2024). Mission: Impossible language models. In L.-W. Ku, A. Martins, & V. Srikumar (Eds.), Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 14691–14714). Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.acl-long.787
- Kamvar, S. D., Klein, D., & Manning, C. D. (2002). Interpreting and extending classical agglomerative clustering algorithms using a model-based approach. Proceedings of the 19th International Conference on Machine Learning, 283–290.
- Kaplan, J., McCandlish, S., Henighan, T., Brown, T. B., Chess, B., Child, R., Gray, S., Radford, A., Wu, J., & Amodei, D. (2020). Scaling laws for neural language models. https://doi.org/10.48550/arXiv.2001.08361
- Khan, F. (2025). FareedKhan-dev/train-deepseek-r1. https://github.com/FareedKhan-dev/train-deepseek-r1
- Kim, J. (2025). kjsman/stable-diffusion-pytorch. https://github.com/kjsman/stable-diffusion-pytorch
- Kingma, D. P., & Welling, M. (2019). An Introduction to Variational Autoencoders. Foundations and Trends® in Machine Learning, 12(4), 307–392. https://doi.org/10.1561/2200000056
- Kirk, H. R., Jun, Y., Iqbal, H., Benussi, E., Volpin, F., Dreyer, F. A., Shtedritski, A., & Asano, Y. M. (2024). Bias out-of-the-box: an empirical analysis of intersectional occupational biases in popular generative language models. Proceedings of the 35th International Conference on Neural Information Processing Systems, 2611–2624.
- Koenker, R. (2005). Quantile Regression (Number 38). Cambridge University Press.
- Koenker, R., & Bassett, G., Jr. (1978). Regression quantiles. Econometrica, 46(1), 33–50. https://doi.org/10.2307/1913643
- Kojima, T., Gu, S. S., Reid, M., Matsuo, Y., & Iwasawa, Y. (2022). Large language models are zero-shot reasoners. Proceedings of the 36th International Conference on Neural Information Processing Systems, 22199–22213. https://doi.org/10.5555/3600270.3601883
- Lambert, N., Morrison, J., Pyatkin, V., Huang, S., Ivison, H., Brahman, F., Miranda, L. J. V., Liu, A., Dziri, N., Lyu, S., Gu, Y., Malik, S., Graf, V., Hwang, J. D., Yang, J., Bras, R. L., Tafjord, O., Wilhelm, C., Soldaini, L., … Hajishirzi, H. (2025). Tulu 3: Pushing Frontiers in Open Language Model Post-Training. https://doi.org/10.48550/arXiv.2411.15124
- Learn Prompting: Your Guide to Communicating with AI. (2023). https://learnprompting.org/
- Lee, K.-H., Fischer, I., Wu, Y.-H., Marwood, D., Baluja, S., Schuurmans, D., & Chen, X. (2025). Evolving deeper LLM thinking. https://doi.org/10.48550/arXiv.2501.09891
- Lee, J. D., Sun, D. L., Sun, Y., & Taylor, J. E. (2016). Exact Post-Selection Inference, with Application to the Lasso. The Annals of Statistics, 44(3), 907–927. https://www.jstor.org/stable/43818915
- Li, C. (2020). OpenAI’s GPT-3 language model: a technical overview. https://lambdalabs.com/blog/demystifying-gpt-3
- Liu, R., Gao, J., Zhao, J., Zhang, K., Li, X., Qi, B., Ouyang, W., & Zhou, B. (2025). Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling. https://doi.org/10.48550/arXiv.2502.06703
- Liu, F. T., Ting, K. M., & Zhou, Z.-H. (2012). Isolation-Based Anomaly Detection. ACM Trans. Knowl. Discov. Data, 6(1), 3:1–3:39. https://doi.org/10.1145/2133360.2133363
- Li, T., Khashabi, D., Khot, T., Sabharwal, A., & Srikumar, V. (2020). UNQOVERing stereotyping biases via underspecified questions. In T. Cohn, Y. He, & Y. Liu (Eds.), Findings of the Association for Computational Linguistics: EMNLP 2020 (pp. 3475–3489). Association for Computational Linguistics. https://doi.org/10.18653/v1/2020.findings-emnlp.311
- Li, H., Xu, Z., Taylor, G., Studer, C., & Goldstein, T. (2018). Visualizing the loss landscape of neural nets. Advances in Neural Information Processing Systems, 31. https://proceedings.neurips.cc/paper_files/paper/2018/hash/a41b3bb3e6b050b6c9067c67f663b915-Abstract.html
- Lu, Y., & Morgan, J. L. (2020). Homophone auditory processing in cross-linguistic perspective. Proceedings of the Linguistic Society of America, 5(1), 529–542. https://doi.org/10.3765/plsa.v5i1.4733
- Malkov, Y. A., & Yashunin, D. A. (2018). Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs. https://doi.org/10.48550/arXiv.1603.09320
- Ma, Y., & Fu, Y. (2012). Manifold Learning Theory and Applications. CRC Press. http://www.amazon.com/Manifold-Learning-Theory-Applications-Yunqian/dp/1439871094
- Manning, C. D., Raghavan, P., & Schütze, H. (2009). Introduction to Information Retrieval. Cambridge University Press.
- Mikolov, T., Sutskever, I., Chen, K., Corrado, G., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. Proceedings of the 27th International Conference on Neural Information Processing Systems - Volume 2, 2, 3111–3119.
- Min, S., Lewis, M., Hajishirzi, H., & Zettlemoyer, L. (2022). Noisy channel language model prompting for few-shot text classification. In S. Muresan, P. Nakov, & A. Villavicencio (Eds.), Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 5316–5330). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.acl-long.365
- Min, S., Lyu, X., Holtzman, A., Artetxe, M., Lewis, M., Hajishirzi, H., & Zettlemoyer, L. (2022). Rethinking the role of demonstrations: what makes in-context learning work? In Y. Goldberg, Z. Kozareva, & Y. Zhang (Eds.), Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (pp. 11048–11064). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.emnlp-main.759
- Miyamoto, S. (2012). Inductive and non-inductive methods of clustering. 2012 IEEE International Conference on Granular Computing, 12–17. https://doi.org/10.1109/GrC.2012.6468710
- Mohri, M., Rostamizadeh, A., & Talwalkar, A. (2018). Foundations of Machine Learning (2nd ed.). The MIT Press.
- Molnar, C. Interpretable Machine Learning. Retrieved February 23, 2025, from https://christophm.github.io/interpretable-ml-book/
- Molnar, C. Interpretable Machine Learning (Japanese translation). Retrieved February 23, 2025, from https://hacarus.github.io/interpretable-ml-book-ja/index.html
- Muennighoff, N., Yang, Z., Shi, W., Li, X. L., Fei-Fei, L., Hajishirzi, H., Zettlemoyer, L., Liang, P., Candès, E., & Hashimoto, T. (2025). s1: Simple test-time scaling. https://doi.org/10.48550/arXiv.2501.19393
- Muennighoff, N., Rush, A. M., Barak, B., Scao, T. L., Tazi, N., Piktus, A., Pyysalo, S., Wolf, T., & Raffel, C. (2023, November 2). Scaling data-constrained language models. https://openreview.net/forum?id=j5BuTrEj35
- Mukhoti, J., Kulharia, V., Sanyal, A., Golodetz, S., Torr, P. H. S., & Dokania, P. K. (2020). Calibrating deep neural networks using focal loss. Advances in Neural Information Processing Systems, 33, 15288–15299.
- Natarajan, B. K. (1995). Sparse Approximate Solutions to Linear Systems. SIAM Journal on Computing, 24(2), 227–234. https://doi.org/10.1137/S0097539792240406
- Newman, M. E. J. (2004). Fast algorithm for detecting community structure in networks. Physical Review E, 69(6), 066133. https://doi.org/10.1103/PhysRevE.69.066133
- Newman, M. E. J., & Girvan, M. (2004). Finding and evaluating community structure in networks. Physical Review E, 69(2), 026113. https://doi.org/10.1103/PhysRevE.69.026113
- Newman, M. E. J. (2006). Modularity and community structure in networks. Proceedings of the National Academy of Sciences, 103(23), 8577–8582. https://doi.org/10.1073/pnas.0601602103
- No Free Lunch Theorems. Retrieved February 15, 2025, from http://www.no-free-lunch.org/
- Odena, A., Dumoulin, V., & Olah, C. (2016). Deconvolution and checkerboard artifacts. Distill, 1(10), e3. https://doi.org/10.23915/distill.00003
- Oikarinen, T., & Weng, T.-W. (2022, September 29). CLIP-Dissect: Automatic Description of Neuron Representations in Deep Vision Networks. https://openreview.net/forum?id=iPWiwWHc1V
- OpenAI Platform. Retrieved August 29, 2024, from https://platform.openai.com
- OpenAI. (2023). GPT-4 technical report. https://doi.org/10.48550/arXiv.2303.08774
- OpenAI. (2024). Learning to reason with LLMs. https://openai.com/index/learning-to-reason-with-llms/
- Osband, K. H. (1985). Providing Incentives for Better Cost Forecasting [Doctoral dissertation]. University of California, Berkeley.
- Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C., Mishkin, P., Zhang, C., Agarwal, S., Slama, K., Ray, A., & others. (2022). Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems, 35, 27730–27744.
- Parry, R. M., Jones, W., Stokes, T. H., Phan, J. H., Moffitt, R. A., Fang, H., Shi, L., Oberthuer, A., Fischer, M., Tong, W., & Wang, M. D. (2010). k-Nearest neighbor models for microarray gene expression analysis and clinical outcome prediction. The Pharmacogenomics Journal, 10(4), 292–309. https://doi.org/10.1038/tpj.2010.56
- Pelletier, B. (2024). On the statistical properties of the isolation forest anomaly detection method. https://hal.science/hal-04430185
- 坪井 祐太, 海野 裕也, & 鈴木 潤. (2017). 深層学習による自然言語処理. 講談社.
- Plesner, A., Vontobel, T., & Wattenhofer, R. (2024). Breaking reCAPTCHAv2. https://doi.org/10.1109/COMPSAC61105.2024.00142
- Murphy, K. P. (2022). Probabilistic Machine Learning: An Introduction. MIT Press. https://probml.github.io/pml-book/book1.html
- Prompt Engineering Guide – Nextra. Retrieved August 29, 2024, from https://www.promptingguide.ai/
- Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., Krueger, G., & Sutskever, I. (2021). Learning transferable visual models from natural language supervision. Proceedings of the 38th International Conference on Machine Learning, 8748–8763. https://proceedings.mlr.press/v139/radford21a.html
- Rafailov, R., Hejna, J., Park, R., & Finn, C. (2024, August 26). From r to Q*: Your Language Model is Secretly a Q-Function. https://openreview.net/forum?id=kEVcNxtqXk
- Rafailov, R., Sharma, A., Mitchell, E., Manning, C. D., Ermon, S., & Finn, C. (2023, November 2). Direct preference optimization: your language model is secretly a reward model. https://openreview.net/forum?id=HPuSIXJaa9
- Raghu, M., Zhang, C., Kleinberg, J., & Bengio, S. (2019). Transfusion: understanding transfer learning for medical imaging. In Proceedings of the 33rd International Conference on Neural Information Processing Systems (Number 301, pp. 3347–3357). Curran Associates Inc.
- Raschka, S., & Mirjalili, V. (2020). Python機械学習プログラミング:達人データサイエンティストによる理論と実践 (福島 真太朗 & 株式会社クイープ, Trans.; 第3版). インプレス.
- Rasmussen, C. E., & Williams, C. K. I. (2005). Gaussian Processes for Machine Learning. The MIT Press.
- Razeghi, Y., Logan IV, R. L., Gardner, M., & Singh, S. (2022). Impact of pretraining term frequencies on few-shot numerical reasoning. In Y. Goldberg, Z. Kozareva, & Y. Zhang (Eds.), Findings of the Association for Computational Linguistics: EMNLP 2022 (pp. 840–854). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.findings-emnlp.59
- Reed, S., Zolna, K., Parisotto, E., … de Freitas, N. (2022). A generalist agent. Transactions on Machine Learning Research. https://openreview.net/forum?id=1ikK0kHjvj
- Robertson, S., & Zaragoza, H. (2009). The probabilistic relevance framework: BM25 and beyond. Found. Trends Inf. Retr., 3(4), 333–389. https://doi.org/10.1561/1500000019
- Rombach, R., Blattmann, A., Lorenz, D., Esser, P., & Ommer, B. (2022). High-resolution image synthesis with latent diffusion models. 10674–10685. https://doi.org/10.1109/CVPR52688.2022.01042
- Ronneberger, O., Fischer, P., & Brox, T. (2015). U-Net: convolutional networks for biomedical image segmentation. In N. Navab, J. Hornegger, W. M. Wells, & A. F. Frangi (Eds.), Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015 (pp. 234–241). Springer International Publishing. https://doi.org/10.1007/978-3-319-24574-4_28
- Firth, J. R. (1957). A synopsis of linguistic theory, 1930–1955. Studies in Linguistic Analysis. https://cir.nii.ac.jp/crid/1570854175539816192?lang=en
- Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A. C., & Fei-Fei, L. (2015). ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 115(3), 211–252. https://doi.org/10.1007/s11263-015-0816-y
- Santurkar, S., Durmus, E., Ladhak, F., Lee, C., Liang, P., & Hashimoto, T. (2023). Whose opinions do language models reflect? Proceedings of the 40th International Conference on Machine Learning, 202, 29971–30004.
- Sardana, N., Portes, J., Doubov, S., & Frankle, J. (2024). Beyond Chinchilla-optimal: accounting for inference in language model scaling laws. Proceedings of the 41st International Conference on Machine Learning, 235, 43445–43460.
- Schaeffer, R., Miranda, B., & Koyejo, S. (2023, November 2). Are emergent abilities of large language models a mirage? https://openreview.net/forum?id=ITw9edRDlD
- Schubert, E., Sander, J., Ester, M., Kriegel, H. P., & Xu, X. (2017). DBSCAN revisited, revisited: why and how you should (still) use DBSCAN. ACM Trans. Database Syst., 42(3), 19:1–19:21. https://doi.org/10.1145/3068335
- Schulhoff, S., Ilie, M., Balepur, N., Kahadze, K., Liu, A., Si, C., Li, Y., Gupta, A., Han, H. J., Schulhoff, S., Dulepet, P. S., Vidyadhara, S., Ki, D., Agrawal, S., Pham, C., Kroiz, G., Li, F., Tao, H., Srivastava, A., … Resnik, P. (2024). The Prompt Report: A Systematic Survey of Prompting Techniques. http://arxiv.org/abs/2406.06608
- Schulman, J., Wolski, F., Dhariwal, P., Radford, A., & Klimov, O. (2017). Proximal Policy Optimization Algorithms. https://doi.org/10.48550/arXiv.1707.06347
- Seber, G. A. F., & Lee, A. J. (2003). Linear Regression Analysis (2nd ed.). Wiley.
- Shalev-Shwartz, S., & Ben-David, S. (2014). Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press. https://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning/copy.html
- 山田 育矢, 鈴木 正敏, 山田 康輔, & 李 凌寒. (2023). 大規模言語モデル入門. 技術評論社.
- 山田 育矢, 鈴木 正敏, 西川 荘介, 藤井 一喜, 山田 康輔, & 李 凌寒. (2024). 大規模言語モデル入門Ⅱ〜生成型LLMの実装と評価. 技術評論社.
- Sharif, M., Bhagavatula, S., Bauer, L., & Reiter, M. K. (2016). Accessorize to a crime: real and stealthy attacks on state-of-the-art face recognition. Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, 1528–1540. https://doi.org/10.1145/2976749.2978392
- 石井 健一郎, & 上田 修功. (2014). 教師なし学習入門 (わかりやすいパターン認識 続). オーム社.
- Shinn, N., Cassano, F., Gopinath, A., Narasimhan, K. R., & Yao, S. (2023, November 2). Reflexion: language agents with verbal reinforcement learning. https://openreview.net/forum?id=vAElhFcKW6
- SIGKDD News: 2014 SIGKDD Test of Time Award. (2014). https://www.kdd.org/News/view/2014-sigkdd-test-of-time-award
- Singhal, S., Zeng, J., Bukharin, A., Zhang, Y., Shen, G., Mahabaleshwarkar, A. S., Kartal, B., Suhara, Y., Bercovich, A., Levy, I., Golan, I., Dabbah, M., El-Yaniv, R., Majumdar, S., Gitman, I., Bakhturina, E., Zhang, J. J., Su, B.-Y., Huang, G., … Konuk, T. (2025, June 21). Llama-Nemotron: Efficient Reasoning Models. https://openreview.net/forum?id=ev1xpo9mbI
- Snell, C. V., Lee, J., Xu, K., & Kumar, A. (2024, October 4). Scaling LLM test-time compute optimally can be more effective than scaling parameters for reasoning. https://openreview.net/forum?id=4FWAwZtd2n
- Song, Y., & Ermon, S. (2019). Generative modeling by estimating gradients of the data distribution. In Proceedings of the 33rd International Conference on Neural Information Processing Systems (Number 1067, pp. 11918–11930). Curran Associates Inc.
- Song, Y. (2021). Generative modeling by estimating gradients of the data distribution. Yang Song. https://yang-song.net/blog/2021/score
- Song, Y., Durkan, C., Murray, I., & Ermon, S. (2021, November 9). Maximum likelihood training of score-based diffusion models. https://openreview.net/forum?id=AklttWFnxS9
- Jurafsky, D., & Martin, J. H. Speech and Language Processing (3rd ed. draft). Retrieved June 18, 2025, from https://web.stanford.edu/~jurafsky/slp3/
- Steinwart, I., Pasin, C., Williamson, R., & Zhang, S. (2014). Elicitation and identification of properties. Proceedings of The 27th Conference on Learning Theory, 482–526. https://proceedings.mlr.press/v35/steinwart14.html
- Stiennon, N., Ouyang, L., Wu, J., Ziegler, D. M., Lowe, R., Voss, C., Radford, A., Amodei, D., & Christiano, P. (2020). Learning to summarize from human feedback. Proceedings of the 34th International Conference on Neural Information Processing Systems, 3008–3021.
- Su, J., Ahmed, M., Lu, Y., Pan, S., Bo, W., & Liu, Y. (2024). RoFormer: enhanced transformer with rotary position embedding. Neurocomputing, 568(C). https://doi.org/10.1016/j.neucom.2023.127063
- Szeliski, R. (2022). Computer Vision: Algorithms and Applications. Springer. https://szeliski.org/Book/
- Tal, Y., Magar, I., & Schwartz, R. (2022). Fewer errors, but more stereotypes? The effect of model size on gender bias. In C. Hardmeier, C. Basta, M. R. Costa-jussà, G. Stanovsky, & H. Gonen (Eds.), Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP) (pp. 112–120). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.gebnlp-1.13
- Thakur, A. S., Choudhary, K., Ramayapally, V. S., Vaidyanathan, S., & Hupkes, D. (2024). Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges. https://doi.org/10.48550/arXiv.2406.12624
- THU-MIG/yolov10. (2024). THU-MIG. https://github.com/THU-MIG/yolov10
- Todeschini, R., Ballabio, D., & Consonni, V. (2020). Distances and similarity measures in chemometrics and chemoinformatics. In Encyclopedia of Analytical Chemistry (pp. 1–40). John Wiley & Sons, Ltd. https://doi.org/10.1002/9780470027318.a9438.pub2
- Tsybakov, A. B. (2009). Introduction to Nonparametric Estimation. Springer.
- Ultralytics. Ultralytics YOLO11 object detection model. Retrieved June 12, 2025, from https://github.com/ultralytics/ultralytics/blob/da98efc61d9e0467315fc86c2297c8d81e656b1a/ultralytics/cfg/models/11/yolo11.yaml
- Ustalov, D., & Lambert, N. (2023, July 24). Reinforcement Learning from Human Feedback: A Tutorial [Tutorial]. https://icml.cc/virtual/2023/tutorial/21554
- van den Oord, A., Vinyals, O., & Kavukcuoglu, K. (2017). Neural discrete representation learning. Proceedings of the 31st International Conference on Neural Information Processing Systems, 6309–6318.
- Van Hulse, J., Khoshgoftaar, T. M., & Napolitano, A. (2007). Experimental perspectives on learning from imbalanced data. Proceedings of the 24th International Conference on Machine Learning, 935–942. https://doi.org/10.1145/1273496.1273614
- Vapnik, V. N. (2000). The Nature of Statistical Learning Theory. Springer New York. https://doi.org/10.1007/978-1-4757-3264-1
- Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. Proceedings of the 31st International Conference on Neural Information Processing Systems, 6000–6010.
- Vincent, P. (2011). A connection between score matching and denoising autoencoders. Neural Computation, 23(7), 1661–1674. https://doi.org/10.1162/NECO_a_00142
- Von Oswald, J., Niklasson, E., Randazzo, E., Sacramento, J., Mordvintsev, A., Zhmoginov, A., & Vladymyrov, M. (2022). Transformers learn in-context by gradient descent. https://arxiv.org/abs/2212.07677v2
- Vovk, V., Gammerman, A., & Shafer, G. (2022). Algorithmic Learning in a Random World. Springer International Publishing. https://doi.org/10.1007/978-3-031-06649-8
- Wallace, B. C., Small, K., Brodley, C. E., & Trikalinos, T. A. (2011). Class imbalance, redux. IEEE 11th International Conference on Data Mining, 754–763. https://doi.org/10.1109/ICDM.2011.33
- Wang, X., & Zhou, D. (2024, November 6). Chain-of-Thought Reasoning Without Prompting. https://openreview.net/forum?id=4Zt7S0B0Jp
- Wang, A., Singh, A., Michael, J., Hill, F., Levy, O., & Bowman, S. (2018). GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding. In T. Linzen, G. Chrupała, & A. Alishahi (Eds.), Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP (pp. 353–355). Association for Computational Linguistics. https://doi.org/10.18653/v1/W18-5446
- Wang, Y., Li, H., Han, X., Nakov, P., & Baldwin, T. (2023). Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs. https://doi.org/10.48550/arXiv.2308.13387
- Wang, Y., Yang, Q., Zeng, Z., Ren, L., Liu, L., Peng, B., Cheng, H., He, X., Wang, K., Gao, J., Chen, W., Wang, S., Du, S. S., & Shen, Y. (2025). Reinforcement learning for reasoning in large language models with one training example. https://doi.org/10.48550/arXiv.2504.20571
- Wang, A., Chen, H., Liu, L., Chen, K., Lin, Z., Han, J., & Ding, G. (2024). YOLOv10: Real-Time End-to-End Object Detection. Advances in Neural Information Processing Systems, 37, 107984–108011. https://proceedings.neurips.cc/paper_files/paper/2024/hash/c34ddd05eb089991f06f3c5dc36836e0-Abstract-Conference.html
- Ward Jr., J. H. (1963). Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association, 58(301), 236–244. https://doi.org/10.1080/01621459.1963.10500845
- Warner, B., Chaffin, A., Clavié, B., Weller, O., Hallström, O., Taghadouini, S., Gallagher, A., Biswas, R., Ladhak, F., Aarsen, T., Cooper, N., Adams, G., Howard, J., & Poli, I. (2024). Smarter, better, faster, longer: a modern bidirectional encoder for fast, memory efficient, and long context finetuning and inference. https://doi.org/10.48550/arXiv.2412.13663
- Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B., Xia, F., Chi, E. H., Le, Q. V., & Zhou, D. (2024). Chain-of-thought prompting elicits reasoning in large language models. Proceedings of the 36th International Conference on Neural Information Processing Systems, 24824–24837.
- Wei, J., Tay, Y., Bommasani, R., Raffel, C., Zoph, B., Borgeaud, S., Yogatama, D., Bosma, M., Zhou, D., Metzler, D., Chi, E. H., Hashimoto, T., Vinyals, O., Liang, P., Dean, J., & Fedus, W. (2022). Emergent abilities of large language models. Transactions on Machine Learning Research. https://openreview.net/forum?id=yzkSU5zdwD
- Wei, J., Wei, J., Tay, Y., Tran, D., Webson, A., Lu, Y., Chen, X., Liu, H., Huang, D., Zhou, D., & Ma, T. (2023). Larger language models do in-context learning differently. https://openreview.net/forum?id=DRGnEkbiQZ
- Weinberger, K. Q., & Saul, L. K. (2009). Distance metric learning for large margin nearest neighbor classification. Journal of Machine Learning Research, 10(Feb), 207–244. http://www.jmlr.org/papers/v10/weinberger09a.html
- Welling, M., & Teh, Y. W. (2011). Bayesian learning via stochastic gradient langevin dynamics. Proceedings of the 28th International Conference on International Conference on Machine Learning, 681–688.
- Wen, X., Liu, Z., Zheng, S., Xu, Z., Ye, S., Wu, Z., Liang, X., Wang, Y., Li, J., Miao, Z., Bian, J., & Yang, M. (2025). Reinforcement learning with verifiable rewards implicitly incentivizes correct reasoning in base LLMs. https://doi.org/10.48550/arXiv.2506.14245
- Wettig, A., Gao, T., Zhong, Z., & Chen, D. (2023). Should you mask 15% in masked language modeling? In A. Vlachos & I. Augenstein (Eds.), Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (pp. 2985–3000). Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.eacl-main.217
- Wolpert, D. H. (1996). The lack of a priori distinctions between learning algorithms. Neural Computation, 8(7), 1341–1390. https://doi.org/10.1162/neco.1996.8.7.1341
- Wolpert, D. H., & Macready, W. G. (1997). No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation, 1(1), 67–82. https://doi.org/10.1109/4235.585893
- Wolpert, D. H. (2002). The supervised learning no-free-lunch theorems. In R. Roy, M. Köppen, S. Ovaska, T. Furuhashi, & F. Hoffmann (Eds.), Soft Computing and Industry: Recent Applications (pp. 25–42). Springer. https://doi.org/10.1007/978-1-4471-0123-9_3
- Wu, Y., & He, K. (2018). Group normalization. Computer Vision – ECCV 2018: 15th European Conference, Munich, Germany, September 8-14, 2018, Proceedings, Part XIII, 3–19. https://doi.org/10.1007/978-3-030-01261-8_1
- Wu, T., Lan, J., Yuan, W., Jiao, J., Weston, J. E., & Sukhbaatar, S. (2025, June 18). Thinking LLMs: general instruction following with thought generation. https://openreview.net/forum?id=z6SrgYCdey&noteId=t3y0Ev0lm6
- 篠田 一聡. (2024, August 25). 論文紹介:Direct Preference Optimization: Your Language Model Is Secretly a Reward Model [Slides]. https://speakerdeck.com/kazutoshishinoda/lun-wen-shao-jie-direct-preference-optimization-your-language-model-is-secretly-a-reward-model
- Xiong, R., Yang, Y., He, D., Zheng, K., Zheng, S., Xing, C., Zhang, H., Lan, Y., Wang, L., & Liu, T.-Y. (2020). On layer normalization in the transformer architecture. Proceedings of the 37th International Conference on Machine Learning, 119, 10524–10533.
- Xu, D., & Tian, Y. (2015). A Comprehensive Survey of Clustering Algorithms. Annals of Data Science, 2(2), 165–193. https://doi.org/10.1007/s40745-015-0040-1
- Xu, H., Xie, S., Tan, X., Huang, P.-Y., Howes, R., Sharma, V., Li, S.-W., Ghosh, G., Zettlemoyer, L., & Feichtenhofer, C. (2023, October 13). Demystifying CLIP data. https://openreview.net/forum?id=5BCFlnfE1g
- Yang, G., & Hu, E. J. (2022). Feature Learning in Infinite-Width Neural Networks. https://doi.org/10.48550/arXiv.2011.14522
- Yang, G., Simon, J. B., & Bernstein, J. (2024). A Spectral Condition for Feature Learning. https://doi.org/10.48550/arXiv.2310.17813
- Yang, G., Hu, E. J., Babuschkin, I., Sidor, S., Farhi, D., Pachocki, J., Liu, X., Chen, W., & Gao, J. (2022, March 7). Tensor programs V: tuning large neural networks via zero-shot hyperparameter transfer. https://www.microsoft.com/en-us/research/publication/tuning-large-neural-networks-via-zero-shot-hyperparameter-transfer/
- 岩田 具治. (2015). トピックモデル. 講談社.
- Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K. R., & Cao, Y. (2022, September 29). ReAct: synergizing reasoning and acting in language models. https://openreview.net/forum?id=WE_vluYUL-X
- Yao, S., Yu, D., Zhao, J., Shafran, I., Griffiths, T. L., Cao, Y., & Narasimhan, K. R. (2023, November 2). Tree of Thoughts: deliberate problem solving with large language models. https://openreview.net/forum?id=5Xc1ecxO1h
- 原田 達也. (2017). 画像認識. 講談社. http://opac.dl.itc.u-tokyo.ac.jp/opac/opac_details/?lang=0&amode=11&bibid=2003372347
- Yue, Z., Zhuang, H., Bai, A., Hui, K., Jagerman, R., Zeng, H., Qin, Z., Wang, D., Wang, X., & Bendersky, M. (2024, October 4). Inference Scaling for Long-Context Retrieval Augmented Generation. https://openreview.net/forum?id=FSjIrOm1vz
- Zachary, W. W. (1977). An information flow model for conflict and fission in small groups. Journal of Anthropological Research, 33(4), 452–473. http://www.jstor.org/stable/3629752
- Zelikman, E., Harik, G. R., Shao, Y., Jayasiri, V., Haber, N., & Goodman, N. (2024, August 26). Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking. https://openreview.net/forum?id=oRXPiSOGH9
- Zelikman, E., Wu, Y., Mu, J., & Goodman, N. (2022, October 31). STaR: bootstrapping reasoning with reasoning. https://openreview.net/forum?id=_3ELRdg2sgI
- Zeng, Z., Yu, J., Gao, T., Meng, Y., Goyal, T., & Chen, D. (2023, October 13). Evaluating Large Language Models at Evaluating Instruction Following. https://openreview.net/forum?id=tr0KidwPLc
- 増田 直紀, & 今野 紀雄. (2010). 複雑ネットワーク―基礎から応用まで. 近代科学社.
- Zhang, Y., & Teng, Z. (2021). Natural Language Processing: A Machine Learning Perspective. Cambridge University Press.
- Zhang, B., & Sennrich, R. (2019). Root mean square layer normalization. In Proceedings of the 33rd International Conference on Neural Information Processing Systems (Number 1110, pp. 12381–12392). Curran Associates Inc.
- Zhang, C., Bengio, S., Hardt, M., Recht, B., & Vinyals, O. (2017). Understanding deep learning requires rethinking generalization. https://doi.org/10.48550/arXiv.1611.03530
- Zhao, X., Kang, Z., Feng, A., Levine, S., & Song, D. (2025). Learning to reason without external rewards. https://doi.org/10.48550/arXiv.2505.19590
- Zheng, L., Chiang, W.-L., Sheng, Y., Zhuang, S., Wu, Z., Zhuang, Y., Lin, Z., Li, Z., Li, D., Xing, E., Zhang, H., Gonzalez, J. E., & Stoica, I. (2023, November 2). Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena. https://openreview.net/forum?id=uccHPGDlao
- Zhou, C., Liu, P., Xu, P., Iyer, S., Sun, J., Mao, Y., Ma, X., Efrat, A., Yu, P., Yu, L., Zhang, S., Ghosh, G., Lewis, M., Zettlemoyer, L., & Levy, O. (2023, November 2). LIMA: less is more for alignment. https://openreview.net/forum?id=KBMOKmX2he
- Zhou, Z.-H. (2021). Machine Learning. Springer Singapore. https://doi.org/10.1007/978-981-15-1967-3
- Zhou, C., Yu, L., Babu, A., Tirumala, K., Yasunaga, M., Shamis, L., Kahn, J., Ma, X., Zettlemoyer, L., & Levy, O. (2024, October 4). Transfusion: predict the next token and diffuse images with one multi-modal model. https://openreview.net/forum?id=SI2hI0frk6
- 周 志华. (2022). 機械学習 (大和田 勇人, 玄 光男, 下川 朝有, & 郝 新厂, Trans.). 近代科学社.
用語集 (.bib)
- Kotz, S., Balakrishnan, N., Read, C. B., & Vidakovic, B. (Eds.). (2006). Encyclopedia of Statistical Sciences (2nd ed). Wiley-Interscience.
- Sammut, C., & Webb, G. I. (Eds.). (2017). Encyclopedia of Machine Learning and Data Mining. Springer US. https://doi.org/10.1007/978-1-4899-7687-1
科学哲学-帰納バイアス (.bib)
- Baker, A. (2022). Simplicity. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy (Summer 2022). Metaphysics Research Lab, Stanford University. https://plato.stanford.edu/archives/sum2022/entries/simplicity/
科学哲学 (.bib)
- Baker, A. (2022). Simplicity. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy (Summer 2022). Metaphysics Research Lab, Stanford University. https://plato.stanford.edu/archives/sum2022/entries/simplicity/
- Bell, E., Bryman, A., & Harley, B. (2022). Business Research Methods (6th ed.). Oxford Univ Press.
経営学の方法論-データ収集 (.bib)
- Hand, M. (2014). From Cyberspace to the Dataverse: Trajectories in Digital Social Research. In Big Data? Qualitative Approaches to Digital Research (Vol. 13, pp. 1–27). Emerald Group Publishing Limited. https://doi.org/10.1108/S1042-319220140000013002
- Kitchin, R. (2014). Big Data, new epistemologies and paradigm shifts. Big Data & Society, 1(1). https://doi.org/10.1177/2053951714528481
経営学の方法論-研究事例 (.bib)
- Chen, C. C., & Meindl, J. R. (1991). The construction of leadership images in the popular press: the case of Donald Burr and People Express. Administrative Science Quarterly, 36(4), 521. https://doi.org/10.2307/2393273
- Webb, E. J., Campbell, D. T., Schwartz, R. D., & Sechrest, L. (1966). Unobtrusive Measures: Nonreactive Research in the Social Sciences (pp. xii, 225). Rand Mcnally.
- Zamanou, S., & Glaser, S. R. (1994). Moving toward participation and involvement: Managing and measuring organizational culture. Group & Organization Management, 19(4), 475–502. https://doi.org/10.1177/1059601194194005
経営学の方法論 (.bib)
- Bell, E., Bryman, A., & Harley, B. (2022). Business Research Methods (6th ed.). Oxford Univ Press.
- Chen, C. C., & Meindl, J. R. (1991). The construction of leadership images in the popular press: the case of Donald Burr and People Express. Administrative Science Quarterly, 36(4), 521. https://doi.org/10.2307/2393273
- Creswell, J. W., & Plano Clark, V. L. (2017). Designing and Conducting Mixed Methods Research (3rd ed.). SAGE Publications.
- Hand, M. (2014). From Cyberspace to the Dataverse: Trajectories in Digital Social Research. In Big Data? Qualitative Approaches to Digital Research (Vol. 13, pp. 1–27). Emerald Group Publishing Limited. https://doi.org/10.1108/S1042-319220140000013002
- Kitchin, R. (2014). Big Data, new epistemologies and paradigm shifts. Big Data & Society, 1(1). https://doi.org/10.1177/2053951714528481
- Webb, E. J., Campbell, D. T., Schwartz, R. D., & Sechrest, L. (1966). Unobtrusive Measures: Nonreactive Research in the Social Sciences (pp. xii, 225). Rand Mcnally.
- Zamanou, S., & Glaser, S. R. (1994). Moving toward participation and involvement: Managing and measuring organizational culture. Group & Organization Management, 19(4), 475–502. https://doi.org/10.1177/1059601194194005
提供参考文献 (.bib)
- 100,000 H100 clusters: power, network topology, ethernet vs infiniband, reliability, failures, checkpointing. (2024). SemiAnalysis. https://semianalysis.com/2024/06/17/100000-h100-clusters-power-network/
- 1,000億パラメータの独自LLM「PLaMo-100B」の事後学習が完了. (2024). Preferred Networks Research & Development. https://tech.preferred.jp/ja/blog/plamo-100b-post-training/
- 7群7編 分散協調とエージェント – 電子情報通信学会知識ベース. Retrieved July 5, 2025, from https://www.ieice-hbkb.org/portal/7%e7%be%a4%e3%80%80%e3%82%b3%e3%83%b3%e3%83%94%e3%83%a5%e3%83%bc%e3%82%bf-%ef%bd%bf%ef%be%8c%ef%be%84%ef%bd%b3%ef%bd%aa%ef%bd%b1/7%e7%be%a47%e7%b7%a8-%e5%88%86%e6%95%a3%e5%8d%94%e8%aa%bf%e3%81%a8%e3%82%a8%e3%83%bc%e3%82%b8%e3%82%a7%e3%83%b3%e3%83%88/
- Abu Alfeilat, H. A., Hassanat, A. B. A., Lasassmeh, O., Tarawneh, A. S., Alhasanat, M. B., Eyal Salman, H. S., & Prasath, V. B. S. (2019). Effects of distance measure choice on K-nearest neighbor classifier performance: a review. Big Data, 7(4), 221–248. https://doi.org/10.1089/big.2018.0175
- Aggarwal, C. C., Hinneburg, A., & Keim, D. A. (2001). On the surprising behavior of distance metrics in high dimensional spaces. Proceedings of the 8th International Conference on Database Theory, 420–434.
- Ahmad, A., & Khan, S. S. (2019). Survey of state-of-the-art mixed data clustering algorithms. IEEE Access, 7, 31883–31902. https://doi.org/10.1109/ACCESS.2019.2903568
- Ali, M., Fromm, M., Thellmann, K., Rutmann, R., Lübbering, M., Leveling, J., Klug, K., Ebert, J., Doll, N., Buschhoff, J., Jain, C., Weber, A., Jurkschat, L., Abdelwahab, H., John, C., Ortiz Suarez, P., Ostendorff, M., Weinbach, S., Sifa, R., … Flores-Herr, N. (2024). Tokenizer choice for LLM training: negligible or crucial? In K. Duh, H. Gomez, & S. Bethard (Eds.), Findings of the Association for Computational Linguistics: NAACL 2024 (pp. 3907–3924). Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.findings-naacl.247
- Angelopoulos, A. N., & Bates, S. (2022). A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification. http://arxiv.org/abs/2107.07511
- Angwin, J., Larson, J., Mattu, S., Kirchner, L., & ProPublica. (2016). Machine Bias. ProPublica. https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
- Arnett, C., Chang, T. A., & Bergen, B. (2024). A Bit of a Problem: Measurement Disparities in Dataset Sizes across Languages. In M. Melero, S. Sakti, & C. Soria (Eds.), Proceedings of the 3rd Annual Meeting of the Special Interest Group on Under-resourced Languages @ LREC-COLING 2024 (pp. 1–9). ELRA and ICCL. https://aclanthology.org/2024.sigul-1.1/
- Arnett, C., & Bergen, B. (2025). Why do language models perform worse for morphologically complex languages? In O. Rambow, L. Wanner, M. Apidianaki, H. Al-Khalifa, B. D. Eugenio, & S. Schockaert (Eds.), Proceedings of the 31st International Conference on Computational Linguistics (pp. 6607–6623). Association for Computational Linguistics. https://aclanthology.org/2025.coling-main.441/
- Arora, S., Liang, Y., & Ma, T. (2017, February 6). A simple but tough-to-beat baseline for sentence embeddings. https://openreview.net/forum?id=SyK00v5xx
- Arthur, D., & Vassilvitskii, S. (2007). k-means++: the advantages of careful seeding. Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, 1027–1035.
- Askell, A., Bai, Y., Chen, A., Drain, D., Ganguli, D., Henighan, T., Jones, A., Joseph, N., Mann, B., DasSarma, N., Elhage, N., Hatfield-Dodds, Z., Hernandez, D., Kernion, J., Ndousse, K., Olsson, C., Amodei, D., Brown, T., Clark, J., … Kaplan, J. (2021). A general language assistant as a laboratory for alignment. https://doi.org/10.48550/arXiv.2112.00861
- Aumüller, M., Bernhardsson, E., & Faithfull, A. (2018). ANN-Benchmarks: A benchmarking tool for approximate nearest neighbor algorithms. https://doi.org/10.48550/arXiv.1807.05614
- Azevedo, F. A. C., Carvalho, L. R. B., Grinberg, L. T., Farfel, J. M., Ferretti, R. E. L., Leite, R. E. P., Filho, W. J., Lent, R., & Herculano-Houzel, S. (2009). Equal numbers of neuronal and nonneuronal cells make the human brain an isometrically scaled-up primate brain. Journal of Comparative Neurology, 513(5), 532–541. https://doi.org/10.1002/cne.21974
- Bach, F. (2024). Learning Theory from First Principles. The MIT Press.
- Bahdanau, D., Cho, K., & Bengio, Y. (2015). Neural machine translation by jointly learning to align and translate. In Y. Bengio & Y. LeCun (Eds.), Conference Track Proceedings of the 3rd International Conference on Learning Representations. http://arxiv.org/abs/1409.0473
- Bai, Y., Kadavath, S., Kundu, S., Askell, A., Kernion, J., Jones, A., Chen, A., Goldie, A., Mirhoseini, A., McKinnon, C., Chen, C., Olsson, C., Olah, C., Hernandez, D., Drain, D., Ganguli, D., Li, D., Tran-Johnson, E., Perez, E., … Kaplan, J. (2022). Constitutional AI: harmlessness from AI feedback. https://doi.org/10.48550/arXiv.2212.08073
- Baker, A. (2022). Simplicity. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy (Summer 2022). Metaphysics Research Lab, Stanford University. https://plato.stanford.edu/archives/sum2022/entries/simplicity/
- 北川 源四郎, 竹村 彰通, 赤穂 昭太郎, 今泉 允聡, 内田 誠一, 清 智也, 高野 渉, 辻 真吾, 原 尚幸, 久野 遼平, 松原 仁, 宮地 充子, 森畑 明昌, & 宿久 洋. (2023). 応用基礎としてのデータサイエンス AI×データ活用の実践. 講談社.
- Belkin, M., Rakhlin, A., & Tsybakov, A. B. (2019). Does data interpolation contradict statistical optimality? The 22nd International Conference on Artificial Intelligence and Statistics, 1611–1619. http://proceedings.mlr.press/v89/belkin19a.html
- Bell, E., Bryman, A., & Harley, B. (2022). Business Research Methods (6th ed.). Oxford Univ Press.
- Be My Eyes. (2024). Be My Eyes Accessibility with GPT-4o [Video]. YouTube. https://www.youtube.com/watch?v=Zq710AKC1gg
- Bergstra, J., & Bengio, Y. (2012). Random search for hyper-parameter optimization. Journal of Machine Learning Research, 13, 281–305.
- Bloomberg. (2024). Generative AI 2024 report: assessing opportunities and disruptions in an evolving trillion-dollar market. https://www.bloomberg.com/professional/products/bloomberg-terminal/research/bloomberg-intelligence/download/generative-ai-2024-report/
- Bouman, R., Bukhsh, Z., & Heskes, T. (2024). Unsupervised anomaly detection algorithms on real-world data: how many do we need? Journal of Machine Learning Research, 25, 1–34.
- Bridle, J. S. (1990). Probabilistic interpretation of feedforward classification network outputs, with relationships to statistical pattern recognition. In F. F. Soulié & J. Hérault (Eds.), Neurocomputing (pp. 227–236). Springer. https://doi.org/10.1007/978-3-642-76153-9_28
- Brodersen, K. H., Ong, C. S., Stephan, K. E., & Buhmann, J. M. (2010). The balanced accuracy and its posterior distribution. 2010 20th International Conference on Pattern Recognition, 3121–3124. https://doi.org/10.1109/ICPR.2010.764
- Brynjolfsson, E., Li, D., & Raymond, L. R. (2023). Generative AI at work (Number 31161) [Working Paper]. https://doi.org/10.3386/w31161
- Brynjolfsson, E., Mitchell, T., & Rock, D. (2018). What can machines learn, and what does it mean for occupations and the economy? AEA Papers and Proceedings, 108, 43–47. https://doi.org/10.1257/pandp.20181019
- Bubeck, S., Chandrasekaran, V., Eldan, R., Gehrke, J., Horvitz, E., Kamar, E., Lee, P., Lee, Y. T., Li, Y., Lundberg, S., Nori, H., Palangi, H., Ribeiro, M. T., & Zhang, Y. (2023). Sparks of Artificial General Intelligence: Early experiments with GPT-4. https://doi.org/10.48550/arXiv.2303.12712
- Fraley, C., & Raftery, A. E. (2000). Model-based clustering, discriminant analysis, and density estimation (Technical Report No. 380). Department of Statistics, University of Washington. https://csss.uw.edu/files/working-papers/2000/wp11.pdf
- Campbell, N. D. F., & Kautz, J. (2014). Learning a manifold of fonts. ACM Transactions on Graphics, 33(4), 1–11. https://doi.org/10.1145/2601097.2601212
- Cao, F., Liang, J., & Bai, L. (2009). A new initialization method for categorical data clustering. Expert Systems with Applications, 36(7), 10223–10228. https://doi.org/10.1016/j.eswa.2009.01.060
- Carlini, N., Ippolito, D., Jagielski, M., Lee, K., Tramer, F., & Zhang, C. (2022, September 29). Quantifying memorization across neural language models. https://openreview.net/forum?id=TatRHT_1cK
- Cazzaniga, M. (2024). Gen-AI: artificial intelligence and the future of work. Staff Discussion Notes, 2024(001), 1. https://doi.org/10.5089/9798400262548.006
- Chan, C., Ginosar, S., Zhou, T., & Efros, A. (2019). Everybody dance now. 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 5932–5941. https://doi.org/10.1109/ICCV.2019.00603
- Chapagain, D., Kshetri, N., & Aryal, B. (2024). Deepfake disasters: a comprehensive review of technology, ethical concerns, countermeasures, and societal implications. 2024 International Conference on Emerging Trends in Networks and Computer Communications (ETNCC), 1–9. https://doi.org/10.1109/ETNCC63262.2024.10767452
- Chen, C. C., & Meindl, J. R. (1991). The construction of leadership images in the popular press: the case of Donald Burr and People Express. Administrative Science Quarterly, 36(4), 521. https://doi.org/10.2307/2393273
- Chen, S., Wong, S., Chen, L., & Tian, Y. (2023). Extending context window of large language models via positional interpolation. https://doi.org/10.48550/arXiv.2306.15595
- Chen, T., Kornblith, S., Norouzi, M., & Hinton, G. (2020). A simple framework for contrastive learning of visual representations. Proceedings of the 37th International Conference on Machine Learning, 1597–1607. https://proceedings.mlr.press/v119/chen20j.html
- 赤穂 昭太郎. (2008). カーネル多変量解析―非線形データ解析の新しい展開. 岩波書店.
- Chouldechova, A. (2017). Fair Prediction with Disparate Impact: A Study of Bias in Recidivism Prediction Instruments. Big Data, 5(2), 153–163. https://doi.org/10.1089/big.2016.0047
- Çinlar, E. (2011). Probability and Stochastics. Springer New York. https://doi.org/10.1007/978-0-387-87859-1
- Clauset, A., Newman, M. E. J., & Moore, C. (2004). Finding community structure in very large networks. Physical Review E, 70(6), 066111. https://doi.org/10.1103/PhysRevE.70.066111
- A comprehensive survey of clustering algorithms: State-of-the-art machine learning applications, taxonomy, challenges, and future research prospects. (2022). Engineering Applications of Artificial Intelligence, 110, 104743. https://doi.org/10.1016/j.engappai.2022.104743
- Corbett-Davies, S., Pierson, E., Feller, A., Goel, S., & Huq, A. (2017). Algorithmic decision making and the cost of fairness. Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 797–806. https://doi.org/10.1145/3097983.3098095
- Cover, T. M., & Thomas, J. A. (2006). Elements of Information Theory (2nd ed). Wiley-Interscience.
- Cover, T., & Hart, P. (1967). Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 13(1), 21–27. https://doi.org/10.1109/TIT.1967.1053964
- Creswell, J. W., & Plano Clark, V. L. (2017). Designing and Conducting Mixed Methods Research (3rd ed.). SAGE Publications.
- Cui, Z. (K.), Demirer, M., Jaffe, S., Musolff, L., Peng, S., & Salz, T. (2025). The effects of generative AI on high-skilled work: evidence from three field experiments with software developers (Number 4945566) [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.4945566
- 村上 正康, 稲葉 尚志, & 野沢 宗平. (1989). 演習 線形代数 (改訂版). 培風館.
- Dalal, N., & Triggs, B. (2005). Histograms of oriented gradients for human detection. 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), 1, 886–893 vol. 1. https://doi.org/10.1109/CVPR.2005.177
- Dathathri, S., See, A., Ghaisas, S., Huang, P.-S., McAdam, R., Welbl, J., Bachani, V., Kaskasoli, A., Stanforth, R., Matejovicova, T., Hayes, J., Vyas, N., Merey, M. A., Brown-Cohen, J., Bunel, R., Balle, B., Cemgil, T., Ahmed, Z., Stacpoole, K., … Kohli, P. (2024). Scalable watermarking for identifying large language model outputs. Nature, 634(8035), 818–823. https://doi.org/10.1038/s41586-024-08025-4
- De Cock, D. (2011). Ames, Iowa: alternative to the Boston housing data as an end of semester regression project. Journal of Statistics Education, 19(3), 8. https://doi.org/10.1080/10691898.2011.11889627
- DeepETA: How Uber Predicts Arrival Times Using Deep Learning. (2022). Uber Blog. https://www.uber.com/blog/deepeta-how-uber-predicts-arrival-times/
- DeepSeek-AI, Guo, D., Yang, D., Zhang, H., Song, J., Zhang, R., Xu, R., Zhu, Q., Ma, S., Wang, P., Bi, X., Zhang, X., Yu, X., Wu, Y., Wu, Z. F., Gou, Z., Shao, Z., Li, Z., Gao, Z., … Zhang, Z. (2025). DeepSeek-R1: incentivizing reasoning capability in LLMs via reinforcement learning. https://doi.org/10.48550/arXiv.2501.12948
- Defazio, A., Yang, X. A., Khaled, A., Mishchenko, K., Mehta, H., & Cutkosky, A. (2024, November 6). The road less scheduled. https://openreview.net/forum?id=0XeNkkENuI
- delving_2025_july.png. (2025). GitHub. https://github.com/berenslab/llm-excess-vocab/blob/main/figures/post-publication-updates/delving_2025_july.png
- Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., & Fei-Fei, L. (2009). ImageNet: A large-scale hierarchical image database. 2009 IEEE Conference on Computer Vision and Pattern Recognition, 248–255. https://doi.org/10.1109/CVPR.2009.5206848
- Dettmers, T., Pagnoni, A., Holtzman, A., & Zettlemoyer, L. (2023, November 2). QLoRA: Efficient Finetuning of Quantized LLMs. https://openreview.net/forum?id=OUIFPHEgJU
- 電子情報通信学会. (2019). 7群-7編-1章 エージェントの定義・モデル・概念. In 電子情報通信学会知識ベース (ver.1 ed., Number 7群7編). https://www.ieice-hbkb.org/files/ad_base/view_pdf.html?p=/files/07/07gun_07hen_01.pdf
- Distributional Hypothesis - ACL Wiki. Retrieved June 22, 2025, from https://www.aclweb.org/aclwiki/index.php?title=Distributional_Hypothesis
- Dong, H., Xiong, W., Pang, B., Wang, H., Zhao, H., Zhou, Y., Jiang, N., Sahoo, D., Xiong, C., & Zhang, T. (2024). RLHF Workflow: From Reward Modeling to Online RLHF. Transactions on Machine Learning Research. https://openreview.net/forum?id=a13aYUU9eU
- Douze, M., Guzhva, A., Deng, C., Johnson, J., Szilvasy, G., Mazaré, P.-E., Lomeli, M., Hosseini, L., & Jégou, H. (2025). The Faiss library. https://doi.org/10.48550/arXiv.2401.08281
- drhead. (2024). The VAE used for Stable Diffusion 1.x/2.x and other models (KL-F8) has a critical flaw, probably due to bad training, that is holding back all models that use it (almost certainly including DALL-E 3). [Reddit Post]. r/StableDiffusion. https://www.reddit.com/r/StableDiffusion/comments/1ag5h5s/the_vae_used_for_stable_diffusion_1x2x_and_other/
- The Llama 3 herd of models. (2024). https://doi.org/10.48550/arXiv.2407.21783
- Dutta, D., Ansari, F., Chakrabarty, A., & Das, S. (2025). On the existence of universal simulators of attention. https://doi.org/10.48550/arXiv.2506.18739
- Edwards, W. (1982). Conservatism in human information processing. In A. Tversky, D. Kahneman, & P. Slovic (Eds.), Judgment under Uncertainty: Heuristics and Biases (pp. 359–369). Cambridge University Press. https://doi.org/10.1017/CBO9780511809477.026
- Elad, M. (2010). Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing. Springer. https://doi.org/10.1007/978-1-4419-7011-4
- Eloundou, T., Manning, S., Mishkin, P., & Rock, D. (2023). GPTs are GPTs: an early look at the labor market impact potential of large language models. https://doi.org/10.48550/arXiv.2303.10130
- Eloundou, T., Manning, S., Mishkin, P., & Rock, D. (2024). GPTs are GPTs: Labor market impact potential of LLMs. Science, 384(6702), 1306–1308. https://doi.org/10.1126/science.adj0998
- Emsley, R. (2023). ChatGPT: these are not hallucinations – they’re fabrications and falsifications. Schizophrenia, 9(1), 1–2. https://doi.org/10.1038/s41537-023-00379-4
- Entezari, R., Wortsman, M., Saukh, O., Shariatnia, M. M., Sedghi, H., & Schmidt, L. (2023). The role of pre-training data in transfer learning. https://doi.org/10.48550/arXiv.2302.13602
- Ester, M., Kriegel, H.-P., Sander, J., & Xu, X. (1996). A density-based algorithm for discovering clusters in large spatial databases with noise. Proceedings of 2nd International Conference on Knowledge Discovery and Data Mining.
- 繁桝 算男. (1985). ベイズ統計入門. 東京大学出版会.
- Fedus, W., Zoph, B., & Shazeer, N. (2022). Switch transformers: scaling to trillion parameter models with simple and efficient sparsity. Journal of Machine Learning Research, 23(120), 5232–5270.
- Felten, E. W., Raj, M., & Seamans, R. (2023). How will Language Modelers like ChatGPT Affect Occupations and Industries? (Number 4375268) [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.4375268
- Felten, E. W., Raj, M., & Seamans, R. (2018). A method to link advances in artificial intelligence to occupational abilities. AEA Papers and Proceedings, 108, 54–57. https://doi.org/10.1257/pandp.20181021
- Felten, E. W., Raj, M., & Seamans, R. (2023). Occupational heterogeneity in exposure to Generative AI (Number 4414065) [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.4414065
- Fernando, T., Priyasad, D., Sridharan, S., Ross, A., & Fookes, C. (2025). Face deepfakes: a comprehensive review. https://doi.org/10.48550/arXiv.2502.09812
- Filippucci, F., Gal, P., Jona-Lasinio, C., Leandro, A., & Nicoletti, G. (2024). The impact of artificial intelligence on productivity, distribution and growth: key mechanisms, initial evidence and policy challenges (OECD Artificial Intelligence Papers No. 15). OECD Publishing. https://doi.org/10.1787/8d900037-en
- Filippucci, F., Gal, P., & Schief, M. (2024). Miracle or myth? Assessing the macroeconomic productivity gains from artificial intelligence (OECD Artificial Intelligence Papers No. 29). OECD Publishing. https://doi.org/10.1787/b524a072-en
- Foret, P., Kleiner, A., Mobahi, H., & Neyshabur, B. (2021). Sharpness-Aware Minimization for Efficiently Improving Generalization. https://doi.org/10.48550/arXiv.2010.01412
- Forina, M., Armanino, C., Castino, M., & Ubigli, M. (1986). Multivariate data analysis as a discriminating method of the origin of wines. VITIS - Journal of Grapevine Research, 25(3), 189–201. https://doi.org/10.5073/vitis.1986.25.189-201
- Fraley, C., & Raftery, A. E. (1998). How Many Clusters? Which Clustering Method? Answers Via Model-Based Cluster Analysis. The Computer Journal, 41(8), 578–588. https://doi.org/10.1093/comjnl/41.8.578
- Friedman, J., Hastie, T., & Tibshirani, R. (2001). The Elements of Statistical Learning (Vol. 1). Springer Series in Statistics. Springer, New York. http://statweb.stanford.edu/~tibs/book/preface.ps
- 冨岡 亮太. (2015). スパース性に基づく機械学習 = Machine learning with sparsity inducing regularizations (MLP機械学習プロフェッショナルシリーズ) (講談社サイエンティフィク, Ed.). 講談社. https://elib.maruzen.co.jp/elib/html/BookDetail/Id/3000080942/
- Fu, L., Yang, B., Kuang, Z., Song, J., Li, Y., Zhu, L., Luo, Q., Wang, X., Lu, H., Huang, M., Li, Z., Tang, G., Shan, B., Lin, C., Liu, Q., Wu, B., Feng, H., Liu, H., Huang, C., … Bai, X. (2024). OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning (Version 1). https://doi.org/10.48550/arXiv.2501.00321
- 福水 健次. (2010). カーネル法入門―正定値カーネルによるデータ解析. 朝倉書店.
- Geirhos, R., Rubisch, P., Michaelis, C., Bethge, M., Wichmann, F. A., & Brendel, W. (2018, September 27). ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness. https://openreview.net/forum?id=Bygh9j09KX
- Geirhos, R., Jacobsen, J.-H., Michaelis, C., Zemel, R., Brendel, W., Bethge, M., & Wichmann, F. A. (2020). Shortcut learning in deep neural networks. Nature Machine Intelligence, 2(11), 665–673. https://doi.org/10.1038/s42256-020-00257-z
- Gemini Team, Google. (2025). Gemini 2.5: pushing the frontier with advanced reasoning, multimodality, long context, and next generation agentic capabilities. https://storage.googleapis.com/deepmind-media/gemini/gemini_v2_5_report.pdf
- Ge, D., Jiang, X., & Ye, Y. (2011). A note on the complexity of Lp minimization. Mathematical Programming, 129(2), 285–299. https://doi.org/10.1007/s10107-011-0470-2
- Gershman, S. J., & Goodman, N. D. (2014). Amortized inference in probabilistic reasoning. Proceedings of the Annual Meeting of the Cognitive Science Society, 36.
- Girvan, M., & Newman, M. E. J. (2002). Community structure in social and biological networks. Proceedings of the National Academy of Sciences, 99(12), 7821–7826. https://doi.org/10.1073/pnas.122653799
- Gneiting, T. (2011). Making and evaluating point forecasts. Journal of the American Statistical Association, 106(494), 746–762. https://doi.org/10.1198/jasa.2011.r10138
- Goldblum, M., Finzi, M., Rowan, K., & Wilson, A. G. (2024). Position: the no free lunch theorem, Kolmogorov complexity, and the role of inductive biases in machine learning. Proceedings of the 41st International Conference on Machine Learning (Position Paper Track), 235, 15788–15808.
- Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press. http://www.deeplearningbook.org
- Gottesman, Y. (2023). Understand diffusion models with VAEs. Yoni Gottesman. https://yonigottesman.github.io/2023/03/11/vae.html
- GO株式会社. (2023). タクシーアプリ『GO』の データ基盤の全体像. https://www.slideshare.net/slideshow/ss-258369181/258369181
- Guan, M. Y., Joglekar, M., Wallace, E., Jain, S., Barak, B., Helyar, A., Dias, R., Vallone, A., Ren, H., Wei, J., Chung, H. W., Toyer, S., Heidecke, J., Beutel, A., & Glaese, A. (2025). Deliberative alignment: reasoning enables safer language models. https://doi.org/10.48550/arXiv.2412.16339
- Guo, C., Pleiss, G., Sun, Y., & Weinberger, K. Q. (2017). On calibration of modern neural networks. Proceedings of the 34th International Conference on Machine Learning - Volume 70, 1321–1330.
- Hall, P., Marron, J. S., & Neeman, A. (2005). Geometric representation of high dimension, low sample size data. Journal of the Royal Statistical Society Series B: Statistical Methodology, 67(3), 427–444. https://doi.org/10.1111/j.1467-9868.2005.00510.x
- Handa, K., Tamkin, A., McCain, M., Huang, S., Durmus, E., Heck, S., Mueller, J., Hong, J., Ritchie, S., Belonax, T., Troy, K. K., Amodei, D., Kaplan, J., Clark, J., & Ganguli, D. (2025). Which economic tasks are performed with AI? Evidence from millions of Claude conversations. https://doi.org/10.48550/arXiv.2503.04761
- Hand, M. (2014). From Cyberspace to the Dataverse: Trajectories in Digital Social Research. In Big Data? Qualitative Approaches to Digital Research (Vol. 13, pp. 1–27). Emerald Group Publishing Limited. https://doi.org/10.1108/S1042-319220140000013002
- Hariri, S., Kind, M. C., & Brunner, R. J. (2021). Extended isolation forest. IEEE Transactions on Knowledge and Data Engineering, 33(4), 1479–1489. https://doi.org/10.1109/TKDE.2019.2947676
- Hayou, S., Ghosh, N., & Yu, B. (2024). LoRA+: Efficient Low Rank Adaptation of Large Models. Proceedings of the 41st International Conference on Machine Learning, 17783–17806. https://proceedings.mlr.press/v235/hayou24a.html
- 黒田 成俊. (1980). 関数解析 (Number 15). 共立出版.
- 横井 祥. (2024). 「確率的なオウム」にできること、またそれがなぜできるのかについて. https://speakerdeck.com/eumesy/language-models-as-modern-version-of-the-use-theory-of-meaning
- Henighan, T., Kaplan, J., Katz, M., Chen, M., Hesse, C., Jackson, J., Jun, H., Brown, T. B., Dhariwal, P., Gray, S., Hallacy, C., Mann, B., Radford, A., Ramesh, A., Ryder, N., Ziegler, D. M., Schulman, J., Amodei, D., & McCandlish, S. (2020). Scaling laws for autoregressive generative modeling. https://doi.org/10.48550/arXiv.2010.14701
- Hermann, K., Mobahi, H., Fel, T., & Mozer, M. C. (2023, October 13). On the foundations of shortcut learning. https://openreview.net/forum?id=Tj3xLVuE9f
- Hernandez, D., Kaplan, J., Henighan, T., & McCandlish, S. (2021). Scaling laws for transfer. https://doi.org/10.48550/arXiv.2102.01293
- Hinton, G. E., Srivastava, N., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. R. (2012). Improving neural networks by preventing co-adaptation of feature detectors. https://doi.org/10.48550/arXiv.1207.0580
- Ho, J., Jain, A., & Abbeel, P. (2020). Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems, 33, 6840–6851.
- Hoffmann, J., Borgeaud, S., Mensch, A., et al. (2022, October 31). An empirical analysis of compute-optimal large language model training. https://openreview.net/forum?id=iBBcRUlOAPR
- Hoffmann, J., Borgeaud, S., Mensch, A., et al. (2022). Training compute-optimal large language models. http://arxiv.org/abs/2203.15556
- Holtzman, A., Buys, J., Du, L., Forbes, M., & Choi, Y. (2019, September 25). The curious case of neural text degeneration. https://openreview.net/forum?id=rygGQyrFvH
- Hu, E. J., Shen, Y., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, S., Wang, L., & Chen, W. (2022). LoRA: Low-rank adaptation of large language models. International Conference on Learning Representations. https://openreview.net/forum?id=nZeVKeeFYf9
- Huang, Z. (1998). Extensions to the k-means algorithm for clustering large data sets with categorical values. Data Mining and Knowledge Discovery, 2(3), 283–304. https://doi.org/10.1023/A:1009769707641
- Hu, K. (2023). ChatGPT sets record for fastest-growing user base - analyst note. Reuters. https://www.reuters.com/technology/chatgpt-sets-record-fastest-growing-user-base-analyst-note-2023-02-01/
- 会計担当が38億円を詐欺グループに送金、ビデオ会議のCFOは偽物. (2024). CNN.co.jp. https://www.cnn.co.jp/world/35214839.html
- Hutchins, J. (1995). "The whisky was invisible", or Persistent myths of MT. MT News International, 11. https://web.archive.org/web/20210103041306/http://www.hutchinsweb.me.uk/MTNI-11-1995.pdf
- Izmailov, P., Podoprikhin, D., Garipov, T., Vetrov, D., & Wilson, A. G. (2018). Averaging weights leads to wider optima and better generalization. Proceedings of the Conference on Uncertainty in Artificial Intelligence.
- Jaiswal, A., Babu, A. R., Zadeh, M. Z., Banerjee, D., & Makedon, F. (2021). A Survey on Contrastive Self-Supervised Learning. Technologies, 9(1), 2. https://doi.org/10.3390/technologies9010002
- James, G., Witten, D., Hastie, T., Tibshirani, R., & Taylor, J. (2023). An Introduction to Statistical Learning, with Applications in Python. Springer.
- Jin, M., Yu, Q., Shu, D., Zhao, H., Hua, W., Meng, Y., Zhang, Y., & Du, M. (2024). The impact of reasoning step length on large language models (L.-W. Ku, A. Martins, & V. Srikumar, Eds.; pp. 1830–1842). Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.findings-acl.108
- 金森 敬文, 鈴木 大慈, 竹内 一郎, & 佐藤 一誠. (2016). 機械学習のための連続最適化. 講談社.
- 金森 敬文. (2015). 統計的学習理論. 講談社.
- Jocher, G., Qiu, J., & Chaurasia, A. (2023). Ultralytics YOLO (Version 8.0.0). https://github.com/ultralytics/ultralytics
- Schulman, J. (2023). Reinforcement Learning from Human Feedback: Progress and Challenges. https://www.youtube.com/watch?v=hhiLw5Q_UFg
- Jolliffe, I. T. (2010). Principal Component Analysis (2nd ed.). Springer.
- Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., Tunyasuvunakool, K., Bates, R., Žídek, A., Potapenko, A., Bridgland, A., Meyer, C., Kohl, S. A. A., Ballard, A. J., Cowie, A., Romera-Paredes, B., Nikolov, S., Jain, R., Adler, J., … Hassabis, D. (2021). Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873), 583–589. https://doi.org/10.1038/s41586-021-03819-2
- Kallini, J., Papadimitriou, I., Futrell, R., Mahowald, K., & Potts, C. (2024). Mission: Impossible language models. In L.-W. Ku, A. Martins, & V. Srikumar (Eds.), Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 14691–14714). Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.acl-long.787
- Kamvar, S. D., Klein, D., & Manning, C. D. (2002). Interpreting and extending classical agglomerative clustering algorithms using a model-based approach. Proceedings of the 19th International Conference on Machine Learning, 283–290.
- Kaplan, J., McCandlish, S., Henighan, T., Brown, T. B., Chess, B., Child, R., Gray, S., Radford, A., Wu, J., & Amodei, D. (2020). Scaling laws for neural language models. https://doi.org/10.48550/arXiv.2001.08361
- Khan, F. (2025). FareedKhan-dev/train-deepseek-r1. https://github.com/FareedKhan-dev/train-deepseek-r1
- Kim, J. (2025). kjsman/stable-diffusion-pytorch. https://github.com/kjsman/stable-diffusion-pytorch
- Kingma, D. P., & Ba, J. (2015). Adam: A method for stochastic optimization. Proceedings of 3rd International Conference for Learning Representations. http://arxiv.org/abs/1412.6980
- Kingma, D. P., & Welling, M. (2019). An Introduction to Variational Autoencoders. Foundations and Trends® in Machine Learning, 12(4), 307–392. https://doi.org/10.1561/2200000056
- Kirk, H. R., Jun, Y., Iqbal, H., Benussi, E., Volpin, F., Dreyer, F. A., Shtedritski, A., & Asano, Y. M. (2024). Bias out-of-the-box: an empirical analysis of intersectional occupational biases in popular generative language models. Proceedings of the 35th International Conference on Neural Information Processing Systems, 2611–2624.
- Kitchin, R. (2014). Big Data, new epistemologies and paradigm shifts. Big Data & Society, 1(1). https://doi.org/10.1177/2053951714528481
- Kleinberg, J., Mullainathan, S., & Raghavan, M. (2017). Inherent trade-offs in the fair determination of risk scores. In C. H. Papadimitriou (Ed.), 8th Innovations in Theoretical Computer Science Conference (Vol. 67, pp. 43:1–43:23). Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Germany. https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2017.43
- Kobak, D., González-Márquez, R., Horvát, E.-Á., & Lause, J. (2025). Delving into LLM-assisted writing in biomedical publications through excess vocabulary. Science Advances, 11(27), eadt3813. https://doi.org/10.1126/sciadv.adt3813
- Koenker, R. (2005). Quantile Regression (Econometric Society Monographs, No. 38). Cambridge University Press.
- Koenker, R., & Bassett, G., Jr. (1978). Regression quantiles. Econometrica, 46(1), 33–50. https://doi.org/10.2307/1913643
- Kojima, T., Gu, S. S., Reid, M., Matsuo, Y., & Iwasawa, Y. (2022). Large language models are zero-shot reasoners. Proceedings of the 36th International Conference on Neural Information Processing Systems, 22199–22213. https://doi.org/10.5555/3600270.3601883
- Kotz, S., Balakrishnan, N., Read, C. B., & Vidakovic, B. (Eds.). (2006). Encyclopedia of Statistical Sciences (2nd ed). Wiley-Interscience.
- Lambert, N., Morrison, J., Pyatkin, V., Huang, S., Ivison, H., Brahman, F., Miranda, L. J. V., Liu, A., Dziri, N., Lyu, S., Gu, Y., Malik, S., Graf, V., Hwang, J. D., Yang, J., Bras, R. L., Tafjord, O., Wilhelm, C., Soldaini, L., … Hajishirzi, H. (2025). Tulu 3: Pushing Frontiers in Open Language Model Post-Training. https://doi.org/10.48550/arXiv.2411.15124
- Lane, M. (2024). Who will be the workers most affected by AI? A closer look at the impact of AI on women, low-skilled workers and other groups (OECD Artificial Intelligence Papers, No. 26). OECD Publishing. https://doi.org/10.1787/14dc6f89-en
- Learn Prompting: Your Guide to Communicating with AI. (2023). https://learnprompting.org/
- Lee, K.-H., Fischer, I., Wu, Y.-H., Marwood, D., Baluja, S., Schuurmans, D., & Chen, X. (2025). Evolving deeper LLM thinking. https://doi.org/10.48550/arXiv.2501.09891
- Lee, J. D., Sun, D. L., Sun, Y., & Taylor, J. E. (2016). Exact Post-Selection Inference, with Application to the Lasso. The Annals of Statistics, 44(3), 907–927. https://www.jstor.org/stable/43818915
- Li, P., Yang, J., Islam, M. A., & Ren, S. (2025). Making AI less 'thirsty'. Communications of the ACM, 68(7), 54–61. https://doi.org/10.1145/3724499
- 鈴木 大慈. (2015). 確率的最適化. 講談社.
- Li, C. (2020). OpenAI’s GPT-3 language model: a technical overview. https://lambdalabs.com/blog/demystifying-gpt-3
- Lipsey, R. G., Carlaw, K. I., & Bekar, C. T. (2005). Economic Transformations: General Purpose Technologies and Long-Term Economic Growth. Oxford University Press. https://doi.org/10.1093/oso/9780199285648.001.0001
- Liu, R., Gao, J., Zhao, J., Zhang, K., Li, X., Qi, B., Ouyang, W., & Zhou, B. (2025). Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling. https://doi.org/10.48550/arXiv.2502.06703
- Liu, F. T., Ting, K. M., & Zhou, Z.-H. (2012). Isolation-Based Anomaly Detection. ACM Trans. Knowl. Discov. Data, 6(1), 3:1–3:39. https://doi.org/10.1145/2133360.2133363
- Li, T., Khashabi, D., Khot, T., Sabharwal, A., & Srikumar, V. (2020). UNQOVERing stereotyping biases via underspecified questions. In T. Cohn, Y. He, & Y. Liu (Eds.), Findings of the Association for Computational Linguistics: EMNLP 2020 (pp. 3475–3489). Association for Computational Linguistics. https://doi.org/10.18653/v1/2020.findings-emnlp.311
- Li, H., Xu, Z., Taylor, G., Studer, C., & Goldstein, T. (2018). Visualizing the loss landscape of neural nets. Advances in Neural Information Processing Systems, 31. https://proceedings.neurips.cc/paper_files/paper/2018/hash/a41b3bb3e6b050b6c9067c67f663b915-Abstract.html
- Luenberger, D. G. (1997). Optimization by Vector Space Methods (1st ed.). John Wiley & Sons, Inc.
- Lu, Y., & Morgan, J. L. (2020). Homophone auditory processing in cross-linguistic perspective. Proceedings of the Linguistic Society of America, 5(1), 529–542. https://doi.org/10.3765/plsa.v5i1.4733
- Mäkelä, E., & Stephany, F. (2025). Complement or substitute? How AI increases the demand for human skills (SSRN Scholarly Paper No. 5153230). https://doi.org/10.2139/ssrn.5153230
- Malkov, Y. A., & Yashunin, D. A. (2018). Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs. https://doi.org/10.48550/arXiv.1603.09320
- Ma, Y., & Fu, Y. (2012). Manifold Learning Theory and Applications. CRC Press. http://www.amazon.com/Manifold-Learning-Theory-Applications-Yunqian/dp/1439871094
- Manning, C. D., Raghavan, P., & Schütze, H. (2009). Introduction to Information Retrieval. Cambridge University Press.
- Manzini, A., Keeling, G., Alberts, L., Vallor, S., Morris, M. R., & Gabriel, I. (2024). The code that binds us: navigating the appropriateness of human-AI assistant relationships. Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 7(1), 943–957. https://doi.org/10.1609/aies.v7i1.31694
- Marx, K. (1959). Economic & Philosophic Manuscripts of 1844 (M. Milligan, Tran.). Progress Publishers. https://www.marxists.org/archive/marx/works/1844/manuscripts/preface.htm
- Mata v. Avianca, Inc. (No. 1:22-cv-01461). District Court, S.D. New York. Retrieved July 13, 2025, from https://www.courtlistener.com/docket/63107798/mata-v-avianca-inc/
- 梅谷 俊治. (2020). しっかり学ぶ数理最適化 モデルからアルゴリズムまで. 講談社.
- Mikolov, T., Sutskever, I., Chen, K., Corrado, G., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. Proceedings of the 27th International Conference on Neural Information Processing Systems, 2, 3111–3119.
- Min, S., Lewis, M., Hajishirzi, H., & Zettlemoyer, L. (2022). Noisy channel language model prompting for few-shot text classification. In S. Muresan, P. Nakov, & A. Villavicencio (Eds.), Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 5316–5330). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.acl-long.365
- Min, S., Lyu, X., Holtzman, A., Artetxe, M., Lewis, M., Hajishirzi, H., & Zettlemoyer, L. (2022). Rethinking the role of demonstrations: what makes in-context learning work? In Y. Goldberg, Z. Kozareva, & Y. Zhang (Eds.), Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (pp. 11048–11064). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.emnlp-main.759
- Miyamoto, S. (2012). Inductive and non-inductive methods of clustering. 2012 IEEE International Conference on Granular Computing, 12–17. https://doi.org/10.1109/GrC.2012.6468710
- Moffatt v. Air Canada, 2024 BCCRT 149 (CanLII). https://canlii.ca/t/k2spq
- Mohri, M., Rostamizadeh, A., & Talwalkar, A. (2018). Foundations of Machine Learning (2nd ed.). The MIT Press.
- Molnar, C. Interpretable Machine Learning. Retrieved February 23, 2025, from https://christophm.github.io/interpretable-ml-book/
- Molnar, C. Interpretable Machine Learning(邦訳). Retrieved February 23, 2025, from https://hacarus.github.io/interpretable-ml-book-ja/index.html
- Muennighoff, N., Yang, Z., Shi, W., Li, X. L., Fei-Fei, L., Hajishirzi, H., Zettlemoyer, L., Liang, P., Candès, E., & Hashimoto, T. (2025). s1: Simple test-time scaling. https://doi.org/10.48550/arXiv.2501.19393
- Muennighoff, N., Rush, A. M., Barak, B., Scao, T. L., Tazi, N., Piktus, A., Pyysalo, S., Wolf, T., & Raffel, C. (2023, November 2). Scaling data-constrained language models. https://openreview.net/forum?id=j5BuTrEj35
- Mukhoti, J., Kulharia, V., Sanyal, A., Golodetz, S., Torr, P. H. S., & Dokania, P. K. (2020). Calibrating deep neural networks using focal loss. Advances in Neural Information Processing Systems, 33, 15288–15299.
- Natarajan, B. K. (1995). Sparse Approximate Solutions to Linear Systems. SIAM Journal on Computing, 24(2), 227–234. https://doi.org/10.1137/S0097539792240406
- ndl-lab/pdmocrdataset-part1. (2024). ndl-lab. https://github.com/ndl-lab/pdmocrdataset-part1
- Newman, M. E. J. (2004). Fast algorithm for detecting community structure in networks. Physical Review E, 69(6), 066133. https://doi.org/10.1103/PhysRevE.69.066133
- Newman, M. E. J., & Girvan, M. (2004). Finding and evaluating community structure in networks. Physical Review E, 69(2), 026113. https://doi.org/10.1103/PhysRevE.69.026113
- Newman, M. E. J. (2006). Modularity and community structure in networks. Proceedings of the National Academy of Sciences, 103(23), 8577–8582. https://doi.org/10.1073/pnas.0601602103
- No Free Lunch Theorems. Retrieved February 15, 2025, from http://www.no-free-lunch.org/
- Noy, S., & Zhang, W. (2023). Experimental evidence on the productivity effects of generative artificial intelligence. Science, 381(6654), 187–192. https://doi.org/10.1126/science.adh2586
- NVIDIA announces financial results for first quarter fiscal 2026. (2025). NVIDIA Newsroom. https://nvidianews.nvidia.com/news/nvidia-announces-financial-results-for-first-quarter-fiscal-2026
- O’Brien, P. C., & Fleming, T. R. (1979). A multiple testing procedure for clinical trials. Biometrics, 35(3), 549–556. https://doi.org/10.2307/2530245
- Odena, A., Dumoulin, V., & Olah, C. (2016). Deconvolution and checkerboard artifacts. Distill, 1(10), e3. https://doi.org/10.23915/distill.00003
- Oikarinen, T., & Weng, T.-W. (2022, September 29). CLIP-Dissect: Automatic Description of Neuron Representations in Deep Vision Networks. https://openreview.net/forum?id=iPWiwWHc1V
- OpenAI Platform. Retrieved August 29, 2024, from https://platform.openai.com
- OpenAI. (2023). GPT-4 technical report. https://doi.org/10.48550/arXiv.2303.08774
- OpenAI. (2024). Learning to reason with LLMs. https://openai.com/index/learning-to-reason-with-llms/
- Osband, K. H. (1985). Providing Incentives for Better Cost Forecasting [PhD thesis]. University of California, Berkeley.
- Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C., Mishkin, P., Zhang, C., Agarwal, S., Slama, K., Ray, A., & others. (2022). Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems, 35, 27730–27744.
- OWASP. (2024). OWASP Top 10 for LLM applications & generative AI (Technical Report OWASP PDF v4.2.0a 20241114-202703).
- Parry, R. M., Jones, W., Stokes, T. H., Phan, J. H., Moffitt, R. A., Fang, H., Shi, L., Oberthuer, A., Fischer, M., Tong, W., & Wang, M. D. (2010). k-Nearest neighbor models for microarray gene expression analysis and clinical outcome prediction. The Pharmacogenomics Journal, 10(4), 292–309. https://doi.org/10.1038/tpj.2010.56
- Pelletier, B. (2024). On the statistical properties of the isolation forest anomaly detection method. https://hal.science/hal-04430185
- Penedo, G., Malartic, Q., Hesslow, D., Cojocaru, R., Cappelli, A., Alobeidli, H., Pannier, B., Almazrouei, E., & Launay, J. (2023). The RefinedWeb dataset for Falcon LLM: outperforming curated corpora with web data, and web data only. https://doi.org/10.48550/arXiv.2306.01116
- Peng, S., Kalliamvakou, E., Cihon, P., & Demirer, M. (2023). The impact of AI on developer productivity: evidence from GitHub Copilot. https://doi.org/10.48550/arXiv.2302.06590
- Petersen, K. B., & Pedersen, M. S. (2012). The Matrix Cookbook. Technical University of Denmark. http://www2.compute.dtu.dk/pubdb/pubs/3274-full.html
- 坪井 祐太, 海野 裕也, & 鈴木 潤. (2017). 深層学習による自然言語処理. 講談社.
- Plesner, A., Vontobel, T., & Wattenhofer, R. (2024). Breaking reCAPTCHAv2. https://doi.org/10.1109/COMPSAC61105.2024.00142
- Murphy, K. P. (2022). Probabilistic Machine Learning: An Introduction. MIT Press. https://probml.github.io/pml-book/book1.html
- Prompt Engineering Guide. Retrieved August 29, 2024, from https://www.promptingguide.ai/
- Proschan, M. A., Lan, K. K. G., & Wittes, J. T. (2006). Statistical Monitoring of Clinical Trials: A Unified Approach. Springer.
- Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., Krueger, G., & Sutskever, I. (2021). Learning transferable visual models from natural language supervision. Proceedings of the 38th International Conference on Machine Learning, 8748–8763. https://proceedings.mlr.press/v139/radford21a.html
- Rafailov, R., Hejna, J., Park, R., & Finn, C. (2024, August 26). From r to Q*: Your Language Model is Secretly a Q-Function. https://openreview.net/forum?id=kEVcNxtqXk
- Rafailov, R., Sharma, A., Mitchell, E., Manning, C. D., Ermon, S., & Finn, C. (2023, November 2). Direct preference optimization: your language model is secretly a reward model. https://openreview.net/forum?id=HPuSIXJaa9
- Raghu, M., Zhang, C., Kleinberg, J., & Bengio, S. (2019). Transfusion: understanding transfer learning for medical imaging. In Proceedings of the 33rd International Conference on Neural Information Processing Systems (Number 301, pp. 3347–3357). Curran Associates Inc.
- Raschka, S., & Mirjalili, V. (2020). Python機械学習プログラミング:達人データサイエンティストによる理論と実践 (福島 真太朗 & 株式会社クイープ, Trans.; 第3版). インプレス.
- Rasmussen, C. E., & Williams, C. K. I. (2005). Gaussian Processes for Machine Learning. The MIT Press.
- Razeghi, Y., Logan IV, R. L., Gardner, M., & Singh, S. (2022). Impact of pretraining term frequencies on few-shot numerical reasoning. In Y. Goldberg, Z. Kozareva, & Y. Zhang (Eds.), Findings of the Association for Computational Linguistics: EMNLP 2022 (pp. 840–854). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.findings-emnlp.59
- Reed, S., et al. (2022). A generalist agent. Transactions on Machine Learning Research. https://openreview.net/forum?id=1ikK0kHjvj
- 人工知能関連技術の研究開発及び活用の推進に関する法律, (2025). https://laws.e-gov.go.jp/law/507AC0000000053
- 人工知能学会. (2023年5月版). AIマップβ2.0. https://www.ai-gakkai.or.jp/aimap/
- 日本放送協会. (2025). 「性的ディープフェイク」で行政罰の条例改正案を可決 鳥取県. NHKニュース. https://www3.nhk.or.jp/news/html/20250630/k10014848661000.html
- Robertson, S., & Zaragoza, H. (2009). The probabilistic relevance framework: BM25 and beyond. Found. Trends Inf. Retr., 3(4), 333–389. https://doi.org/10.1561/1500000019
- Rokach, L., & Maimon, O. (2014). Data Mining with Decision Trees: Theory and Applications (2nd ed., Vol. 81). WORLD SCIENTIFIC. https://doi.org/10.1142/9097
- Rombach, R., Blattmann, A., Lorenz, D., Esser, P., & Ommer, B. (2022). High-resolution image synthesis with latent diffusion models. 10674–10685. https://doi.org/10.1109/CVPR52688.2022.01042
- Ronneberger, O., Fischer, P., & Brox, T. (2015). U-Net: convolutional networks for biomedical image segmentation. In N. Navab, J. Hornegger, W. M. Wells, & A. F. Frangi (Eds.), Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015 (pp. 234–241). Springer International Publishing. https://doi.org/10.1007/978-3-319-24574-4_28
- Firth, J. R. (1957). A synopsis of linguistic theory, 1930–1955. Studies in Linguistic Analysis. https://cir.nii.ac.jp/crid/1570854175539816192?lang=en
- Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A. C., & Fei-Fei, L. (2015). ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 115(3), 211–252. https://doi.org/10.1007/s11263-015-0816-y
- Sammut, C., & Webb, G. I. (Eds.). (2017). Encyclopedia of Machine Learning and Data Mining. Springer US. https://doi.org/10.1007/978-1-4899-7687-1
- Santurkar, S., Durmus, E., Ladhak, F., Lee, C., Liang, P., & Hashimoto, T. (2023). Whose opinions do language models reflect? Proceedings of the 40th International Conference on Machine Learning, 202, 29971–30004.
- Sardana, N., Portes, J., Doubov, S., & Frankle, J. (2024). Beyond Chinchilla-optimal: accounting for inference in language model scaling laws. Proceedings of the 41st International Conference on Machine Learning, 235, 43445–43460.
- Schaeffer, R., Miranda, B., & Koyejo, S. (2023, November 2). Are emergent abilities of large language models a mirage? https://openreview.net/forum?id=ITw9edRDlD
- Schubert, E., Sander, J., Ester, M., Kriegel, H. P., & Xu, X. (2017). DBSCAN revisited, revisited: why and how you should (still) use DBSCAN. ACM Trans. Database Syst., 42(3), 19:1–19:21. https://doi.org/10.1145/3068335
- Schulhoff, S., Ilie, M., Balepur, N., Kahadze, K., Liu, A., Si, C., Li, Y., Gupta, A., Han, H. J., Schulhoff, S., Dulepet, P. S., Vidyadhara, S., Ki, D., Agrawal, S., Pham, C., Kroiz, G., Li, F., Tao, H., Srivastava, A., … Resnik, P. (2024). The Prompt Report: A Systematic Survey of Prompting Techniques. http://arxiv.org/abs/2406.06608
- Schulman, J., Wolski, F., Dhariwal, P., Radford, A., & Klimov, O. (2017). Proximal Policy Optimization Algorithms. https://doi.org/10.48550/arXiv.1707.06347
- Seber, G. A. F., & Lee, A. J. (2003). Linear Regression Analysis (2nd ed.). Wiley.
- Shalev-Shwartz, S., & Ben-David, S. (2014). Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press. https://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning/copy.html
- 山田 育矢, 鈴木 正敏, 山田 康輔, & 李 凌寒. (2023). 大規模言語モデル入門. 技術評論社.
- 山田 育矢, 鈴木 正敏, 西川 荘介, 藤井 一喜, 山田 康輔, & 李 凌寒. (2024). 大規模言語モデル入門Ⅱ〜生成型LLMの実装と評価. 技術評論社.
- 山下 信雄. (2015). 非線形計画法. 朝倉書店. http://opac.dl.itc.u-tokyo.ac.jp/opac/opac_details/?lang=0&amode=11&bibid=2003278170
- Sharif, M., Bhagavatula, S., Bauer, L., & Reiter, M. K. (2016). Accessorize to a crime: real and stealthy attacks on state-of-the-art face recognition. Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, 1528–1540. https://doi.org/10.1145/2976749.2978392
- 神嶌 敏弘. (2019). 変わりゆく機械学習と変わらない機械学習. 日本物理学会誌, 74(1), 5–13. https://doi.org/10.11316/butsuri.74.1_5
- 生成AIでランサムウェアを作成した容疑者の摘発事例を考察. (2025). Trend Micro. https://www.trendmicro.com/ja_jp/jp-security/24/e/breaking-securitynews-20240529-02.html
- 生成AI悪用し楽天モバイルに不正アクセス、1000件以上の回線入手し転売か…容疑で中高生3人逮捕. (2025). 読売新聞オンライン. https://www.yomiuri.co.jp/national/20250226-OYT1T50205/
- 生成AI悪用しウイルス作成、有罪判決…IT知識なくとも「1か月ぐらいで簡単に作れた」. (2024). 読売新聞オンライン. https://www.yomiuri.co.jp/national/20241025-OYT1T50209/
- 石黒 勝彦, & 林 浩平. (2016). 関係データ学習. 講談社.
- 石井 健一郎, & 上田 修功. (2014). 教師なし学習入門 (わかりやすいパターン認識 続). オーム社.
- Shinn, N., Cassano, F., Gopinath, A., Narasimhan, K. R., & Yao, S. (2023, November 2). Reflexion: language agents with verbal reinforcement learning. https://openreview.net/forum?id=vAElhFcKW6
- SIGKDD News: 2014 SIGKDD Test of Time Award. (2014). https://www.kdd.org/News/view/2014-sigkdd-test-of-time-award
- Singhal, S., Zeng, J., Bukharin, A., Zhang, Y., Shen, G., Mahabaleshwarkar, A. S., Kartal, B., Suhara, Y., Bercovich, A., Levy, I., Golan, I., Dabbah, M., El-Yaniv, R., Majumdar, S., Gitman, I., Bakhturina, E., Zhang, J. J., Su, B.-Y., Huang, G., … Konuk, T. (2025, June 21). Llama-Nemotron: Efficient Reasoning Models. https://openreview.net/forum?id=ev1xpo9mbI
- Snell, C. V., Lee, J., Xu, K., & Kumar, A. (2024, October 4). Scaling LLM test-time compute optimally can be more effective than scaling parameters for reasoning. https://openreview.net/forum?id=4FWAwZtd2n
- Song, Y., & Ermon, S. (2019). Generative modeling by estimating gradients of the data distribution. In Proceedings of the 33rd International Conference on Neural Information Processing Systems (Number 1067, pp. 11918–11930). Curran Associates Inc.
- Song, Y. (2021). Generative modeling by estimating gradients of the data distribution. Yang Song. https://yang-song.net/blog/2021/score
- Song, Y., Durkan, C., Murray, I., & Ermon, S. (2021, November 9). Maximum likelihood training of score-based diffusion models. https://openreview.net/forum?id=AklttWFnxS9
- Jurafsky, D., & Martin, J. H. Speech and Language Processing. Retrieved June 18, 2025, from https://web.stanford.edu/~jurafsky/slp3/
- Steinwart, I., Pasin, C., Williamson, R., & Zhang, S. (2014). Elicitation and identification of properties. Proceedings of The 27th Conference on Learning Theory, 482–526. https://proceedings.mlr.press/v35/steinwart14.html
- Stiennon, N., Ouyang, L., Wu, J., Ziegler, D. M., Lowe, R., Voss, C., Radford, A., Amodei, D., & Christiano, P. (2020). Learning to summarize from human feedback. Proceedings of the 34th International Conference on Neural Information Processing Systems, 3008–3021.
- Storchan, V., Kumar, R., Chowdhury, R., Goldfarb-Tarrant, S., & Cattell, S. (2024). Generative AI red teaming challenge: transparency report [Technical Report]. DEF CON. https://drive.google.com/file/d/1JqpbIP6DNomkb32umLoiEPombK2-0Rc-/view
- Su, J., Ahmed, M., Lu, Y., Pan, S., Bo, W., & Liu, Y. (2024). RoFormer: enhanced transformer with rotary position embedding. Neurocomputing, 568(C). https://doi.org/10.1016/j.neucom.2023.127063
- Szeliski, R. (2022). Computer Vision: Algorithms and Applications. Springer. https://szeliski.org/Book/
- Tabassi, E. (2023). Artificial Intelligence Risk Management Framework (AI RMF 1.0). National Institute of Standards and Technology. https://doi.org/10.6028/nist.ai.100-1
- Takami Sato. (2024). 最適化超入門. https://speakerdeck.com/tkm2261/zui-shi-hua-chao-ru-men
- Tal, Y., Magar, I., & Schwartz, R. (2022). Fewer errors, but more stereotypes? The effect of model size on gender bias. In C. Hardmeier, C. Basta, M. R. Costa-jussà, G. Stanovsky, & H. Gonen (Eds.), Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP) (pp. 112–120). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.gebnlp-1.13
- Tamkin, A., McCain, M., Handa, K., Durmus, E., Lovitt, L., Rathi, A., Huang, S., Mountfield, A., Hong, J., Ritchie, S., Stern, M., Clarke, B., Goldberg, L., Sumers, T. R., Mueller, J., McEachen, W., Mitchell, W., Carter, S., Clark, J., … Ganguli, D. (2024). Clio: privacy-preserving insights into real-world AI use. https://doi.org/10.48550/arXiv.2412.13678
- Taori, R., Gulrajani, I., Zhang, T., Dubois, Y., Li, X., Guestrin, C., Liang, P., & Hashimoto, T. B. (2023). Stanford Alpaca: An Instruction-following LLaMA model [Data set]. stanford_alpaca.
- 特許法. Retrieved July 18, 2025, from https://laws.e-gov.go.jp/law/334AC0000000121#Mp-Ch_1
- 特許庁. (2019). 特許・実用新案審査ハンドブック 附属書B 第1章 コンピュータソフトウエア関連発明. https://www.jpo.go.jp/system/laws/rule/guideline/patent/handbook_shinsa/document/index/app_b1.pdf
- Thakur, A. S., Choudhary, K., Ramayapally, V. S., Vaidyanathan, S., & Hupkes, D. (2024). Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges. https://doi.org/10.48550/arXiv.2406.12624
- THU-MIG/yolov10. (2024). THU-MIG. https://github.com/THU-MIG/yolov10
- Todeschini, R., Ballabio, D., & Consonni, V. (2020). Distances and similarity measures in chemometrics and chemoinformatics. In Encyclopedia of Analytical Chemistry (pp. 1–40). John Wiley & Sons, Ltd. https://doi.org/10.1002/9780470027318.a9438.pub2
- Tsybakov, A. B. (2009). Introduction to Nonparametric Estimation. Springer.
- Ultralytics. Ultralytics YOLO11 object detection model. Retrieved June 12, 2025, from https://github.com/ultralytics/ultralytics/blob/da98efc61d9e0467315fc86c2297c8d81e656b1a/ultralytics/cfg/models/11/yolo11.yaml
- Ustalov, D., & Lambert, N. (2023, July 24). Reinforcement Learning from Human Feedback: A Tutorial [Tutorial]. https://icml.cc/virtual/2023/tutorial/21554
- Van Den Oord, A., Vinyals, O., & Kavukcuoglu, K. (2017). Neural discrete representation learning. Proceedings of the 31st International Conference on Neural Information Processing Systems, 6309–6318.
- Van Hulse, J., Khoshgoftaar, T. M., & Napolitano, A. (2007). Experimental perspectives on learning from imbalanced data. Proceedings of the 24th International Conference on Machine Learning, 935–942. https://doi.org/10.1145/1273496.1273614
- Vapnik, V. N. (2000). The Nature of Statistical Learning Theory. Springer New York. https://doi.org/10.1007/978-1-4757-3264-1
- Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. Proceedings of the 31st International Conference on Neural Information Processing Systems, 6000–6010.
- Vershynin, R. (2018). High-Dimensional Probability: An Introduction with Applications in Data Science. Cambridge University Press.
- Vincent, P. (2011). A connection between score matching and denoising autoencoders. Neural Computation, 23(7), 1661–1674. https://doi.org/10.1162/NECO_a_00142
- Von Oswald, J., Niklasson, E., Randazzo, E., Sacramento, J., Mordvintsev, A., Zhmoginov, A., & Vladymyrov, M. (2022). Transformers learn in-context by gradient descent. https://arxiv.org/abs/2212.07677v2
- Vovk, V., Gammerman, A., & Shafer, G. (2022). Algorithmic Learning in a Random World. Springer International Publishing. https://doi.org/10.1007/978-3-031-06649-8
- Wainwright, M. J. (2019). High-Dimensional Statistics: A Non-Asymptotic Viewpoint (1st ed.). Cambridge University Press. https://www.cambridge.org/core/product/identifier/9781108627771/type/book
- Wallace, B. C., Small, K., Brodley, C. E., & Trikalinos, T. A. (2011). Class imbalance, redux. IEEE 11th International Conference on Data Mining, 754–763. https://doi.org/10.1109/ICDM.2011.33
- Wang, X., & Zhou, D. (2024, November 6). Chain-of-Thought Reasoning Without Prompting. https://openreview.net/forum?id=4Zt7S0B0Jp
- Wang, A., Singh, A., Michael, J., Hill, F., Levy, O., & Bowman, S. (2018). GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding (T. Linzen, G. Chrupała, & A. Alishahi, Eds.; pp. 353–355). Association for Computational Linguistics. https://doi.org/10.18653/v1/W18-5446
- Wang, Y., Li, H., Han, X., Nakov, P., & Baldwin, T. (2023). Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs. https://doi.org/10.48550/arXiv.2308.13387
- Wang, Y., Yang, Q., Zeng, Z., Ren, L., Liu, L., Peng, B., Cheng, H., He, X., Wang, K., Gao, J., Chen, W., Wang, S., Du, S. S., & Shen, Y. (2025). Reinforcement learning for reasoning in large language models with one training example. https://doi.org/10.48550/arXiv.2504.20571
- Wang, A., Chen, H., Liu, L., Chen, K., Lin, Z., Han, J., & Ding, G. (2024). YOLOv10: Real-Time End-to-End Object Detection. Advances in Neural Information Processing Systems, 37, 107984–108011. https://proceedings.neurips.cc/paper_files/paper/2024/hash/c34ddd05eb089991f06f3c5dc36836e0-Abstract-Conference.html
- Ward Jr., J. H. (1963). Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association, 58(301), 236–244. https://doi.org/10.1080/01621459.1963.10500845
- Warner, B., Chaffin, A., Clavié, B., Weller, O., Hallström, O., Taghadouini, S., Gallagher, A., Biswas, R., Ladhak, F., Aarsen, T., Cooper, N., Adams, G., Howard, J., & Poli, I. (2024). Smarter, better, faster, longer: a modern bidirectional encoder for fast, memory efficient, and long context finetuning and inference. https://doi.org/10.48550/arXiv.2412.13663
- Webb, M. (2019). The Impact of Artificial Intelligence on the Labor Market (Number 3482150) [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.3482150
- Webb, E. J., Campbell, D. T., Schwartz, R. D., & Sechrest, L. (1966). Unobtrusive Measures: Nonreactive Research in the Social Sciences (pp. xii, 225). Rand Mcnally.
- Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B., Xia, F., Chi, E. H., Le, Q. V., & Zhou, D. (2022). Chain-of-thought prompting elicits reasoning in large language models. Proceedings of the 36th International Conference on Neural Information Processing Systems, 24824–24837.
- Weidinger, L., Barnhart, J., Brennan, J., Butterfield, C., Young, S., Hawkins, W., Hendricks, L. A., Comanescu, R., Chang, O., Rodriguez, M., Beroshi, J., Bloxwich, D., Proleev, L., Chen, J., Farquhar, S., Ho, L., Gabriel, I., Dafoe, A., & Isaac, W. (2024). Holistic safety and responsibility evaluations of advanced AI models. https://doi.org/10.48550/arXiv.2404.14068
- Wei, J., Tay, Y., Bommasani, R., Raffel, C., Zoph, B., Borgeaud, S., Yogatama, D., Bosma, M., Zhou, D., Metzler, D., Chi, E. H., Hashimoto, T., Vinyals, O., Liang, P., Dean, J., & Fedus, W. (2022). Emergent abilities of large language models. Transactions on Machine Learning Research. https://openreview.net/forum?id=yzkSU5zdwD
- Wei, J., Wei, J., Tay, Y., Tran, D., Webson, A., Lu, Y., Chen, X., Liu, H., Huang, D., Zhou, D., & Ma, T. (2023). Larger language models do in-context learning differently. https://openreview.net/forum?id=DRGnEkbiQZ
- Weinberger, K. Q., & Saul, L. K. (2009). Distance metric learning for large margin nearest neighbor classification. Journal of Machine Learning Research, 10(Feb), 207–244. http://www.jmlr.org/papers/v10/weinberger09a.html
- Welling, M., & Teh, Y. W. (2011). Bayesian learning via stochastic gradient langevin dynamics. Proceedings of the 28th International Conference on International Conference on Machine Learning, 681–688.
- 文化審議会著作権分科会法制度小委員会. (2024). AIと著作権に関する考え方について. https://www.bunka.go.jp/seisaku/bunkashingikai/chosakuken/pdf/94037901_01.pdf
- Wen, X., Liu, Z., Zheng, S., Xu, Z., Ye, S., Wu, Z., Liang, X., Wang, Y., Li, J., Miao, Z., Bian, J., & Yang, M. (2025). Reinforcement learning with verifiable rewards implicitly incentivizes correct reasoning in base LLMs. https://doi.org/10.48550/arXiv.2506.14245
- Wettig, A., Gao, T., Zhong, Z., & Chen, D. (2023). Should you mask 15% in masked language modeling? In A. Vlachos & I. Augenstein (Eds.), Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (pp. 2985–3000). Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.eacl-main.217
- Wolpert, D. H. (1996). The lack of a priori distinctions between learning algorithms. Neural Computation, 8(7), 1341–1390. https://doi.org/10.1162/neco.1996.8.7.1341
- Wolpert, D. H., & Macready, W. G. (1997). No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation, 1(1), 67–82. https://doi.org/10.1109/4235.585893
- Wolpert, D. H. (2002). The supervised learning no-free-lunch theorems. In R. Roy, M. Köppen, S. Ovaska, T. Furuhashi, & F. Hoffmann (Eds.), Soft Computing and Industry: Recent Applications (pp. 25–42). Springer. https://doi.org/10.1007/978-1-4471-0123-9_3
- workpiles. (2016). CUCUMBER-9. https://github.com/workpiles/CUCUMBER-9
- World Economic Forum. (2025). Future of Jobs Report 2025 [Insight Report]. https://www.weforum.org/publications/the-future-of-jobs-report-2025/digest/
- Writers Guild of America. (2023). Summary of the 2023 WGA MBA. WGA Contract 2023. https://www.wgacontract2023.org/the-campaign/summary-of-the-2023-wga-mba
- Wu, Y., & He, K. (2018). Group normalization. Computer Vision – ECCV 2018: 15th European Conference, Munich, Germany, September 8-14, 2018, Proceedings, Part XIII, 3–19. https://doi.org/10.1007/978-3-030-01261-8_1
- Wu, T., Lan, J., Yuan, W., Jiao, J., Weston, J. E., & Sukhbaatar, S. (2025, June 18). Thinking LLMs: general instruction following with thought generation. https://openreview.net/forum?id=z6SrgYCdey&noteId=t3y0Ev0lm6
- 篠田 一聡. (2024, August 25). 論文紹介:Direct Preference Optimization: Your Language Model Is Secretly a Reward Model. https://speakerdeck.com/kazutoshishinoda/lun-wen-shao-jie-direct-preference-optimization-your-language-model-is-secretly-a-reward-model
- Xiong, R., Yang, Y., He, D., Zheng, K., Zheng, S., Xing, C., Zhang, H., Lan, Y., Wang, L., & Liu, T.-Y. (2020). On layer normalization in the transformer architecture. Proceedings of the 37th International Conference on Machine Learning, 119, 10524–10533.
- Xu, D., & Tian, Y. (2015). A Comprehensive Survey of Clustering Algorithms. Annals of Data Science, 2(2), 165–193. https://doi.org/10.1007/s40745-015-0040-1
- Xu, H., Xie, S., Tan, X., Huang, P.-Y., Howes, R., Sharma, V., Li, S.-W., Ghosh, G., Zettlemoyer, L., & Feichtenhofer, C. (2023, October 13). Demystifying CLIP data. https://openreview.net/forum?id=5BCFlnfE1g
- 穴井 宏和, & 斉藤 努. (2015). 今日から使える!組合せ最適化 離散問題ガイドブック. 講談社.
- 穴井 宏和. (2013). 数理最適化の実践ガイド. 講談社.
- Yang, G., & Hu, E. J. (2022). Feature Learning in Infinite-Width Neural Networks. https://doi.org/10.48550/arXiv.2011.14522
- Yang, G., Simon, J. B., & Bernstein, J. (2024). A Spectral Condition for Feature Learning. https://doi.org/10.48550/arXiv.2310.17813
- Yang, G., Hu, E. J., Babuschkin, I., Sidor, S., Farhi, D., Pachocki, J., Liu, X., Chen, W., & Gao, J. (2022, March 7). Tensor programs V: tuning large neural networks via zero-shot hyperparameter transfer. https://www.microsoft.com/en-us/research/publication/tuning-large-neural-networks-via-zero-shot-hyperparameter-transfer/
- 岩田 具治. (2015). トピックモデル. 講談社.
- Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K. R., & Cao, Y. (2022, September 29). ReAct: synergizing reasoning and acting in language models. https://openreview.net/forum?id=WE_vluYUL-X
- Yao, S., Yu, D., Zhao, J., Shafran, I., Griffiths, T. L., Cao, Y., & Narasimhan, K. R. (2023, November 2). Tree of Thoughts: deliberate problem solving with large language models. https://openreview.net/forum?id=5Xc1ecxO1h
- 新田 尭之. 生成AIが描く日本の職業の明暗とその対応策.
- Ye, P., Qian, J., Chen, J., Wu, C.-H., Zhou, Y., De Mars, S., Yang, F., & Zhang, L. (2018). Customized Regression Model for Airbnb Dynamic Pricing. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 932–940. https://doi.org/10.1145/3219819.3219830
- 原田 達也. (2017). 画像認識. 講談社.
- Yue, Z., Zhuang, H., Bai, A., Hui, K., Jagerman, R., Zeng, H., Qin, Z., Wang, D., Wang, X., & Bendersky, M. (2024, October 4). Inference Scaling for Long-Context Retrieval Augmented Generation. https://openreview.net/forum?id=FSjIrOm1vz
- Zachary, W. W. (1977). An information flow model for conflict and fission in small groups. Journal of Anthropological Research, 33(4), 452–473. http://www.jstor.org/stable/3629752
- Zamanou, S., & Glaser, S. R. (1994). Moving toward participation and involvement: Managing and measuring organizational culture. Group & Organization Management, 19(4), 475–502. https://doi.org/10.1177/1059601194194005
- Zelikman, E., Harik, G. R., Shao, Y., Jayasiri, V., Haber, N., & Goodman, N. (2024, August 26). Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking. https://openreview.net/forum?id=oRXPiSOGH9
- Zelikman, E., Wu, Y., Mu, J., & Goodman, N. (2022, October 31). STaR: bootstrapping reasoning with reasoning. https://openreview.net/forum?id=_3ELRdg2sgI
- Zeng, Z., Yu, J., Gao, T., Meng, Y., Goyal, T., & Chen, D. (2023, October 13). Evaluating Large Language Models at Evaluating Instruction Following. https://openreview.net/forum?id=tr0KidwPLc
- 増田 直紀, & 今野 紀雄. (2010). 複雑ネットワーク―基礎から応用まで. 近代科学社.
- Zhang, Y., & Teng, Z. (2021). Natural Language Processing: A Machine Learning Perspective. Cambridge University Press.
- Zhang, B., & Sennrich, R. (2019). Root mean square layer normalization. In Proceedings of the 33rd International Conference on Neural Information Processing Systems (Number 1110, pp. 12381–12392). Curran Associates Inc.
- Zhang, C., Bengio, S., Hardt, M., Recht, B., & Vinyals, O. (2017). Understanding deep learning requires rethinking generalization. https://doi.org/10.48550/arXiv.1611.03530
- Zhao, X., Kang, Z., Feng, A., Levine, S., & Song, D. (2025). Learning to reason without external rewards. https://doi.org/10.48550/arXiv.2505.19590
- Zheng, L., Chiang, W.-L., Sheng, Y., Zhuang, S., Wu, Z., Zhuang, Y., Lin, Z., Li, Z., Li, D., Xing, E., Zhang, H., Gonzalez, J. E., & Stoica, I. (2023, November 2). Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena. https://openreview.net/forum?id=uccHPGDlao
- Zhou, C., Liu, P., Xu, P., Iyer, S., Sun, J., Mao, Y., Ma, X., Efrat, A., Yu, P., Yu, L., Zhang, S., Ghosh, G., Lewis, M., Zettlemoyer, L., & Levy, O. (2023, November 2). LIMA: less is more for alignment. https://openreview.net/forum?id=KBMOKmX2he
- Zhou, Z.-H. (2021). Machine Learning. Springer Singapore. https://doi.org/10.1007/978-981-15-1967-3
- Zhou, C., Yu, L., Babu, A., Tirumala, K., Yasunaga, M., Shamis, L., Kahn, J., Ma, X., Zettlemoyer, L., & Levy, O. (2024, October 4). Transfusion: predict the next token and diffuse images with one multi-modal model. https://openreview.net/forum?id=SI2hI0frk6
- 周 志华. (2022). 機械学習 (大和田 勇人, 玄 光男, 下川 朝有, & 郝 新厂, Trans.). 近代科学社.
- 竹村 彰通. (2020). 新装改訂版 現代数理統計学. 学術図書出版社.
- 著作権法第三十条第四項, (2019). https://laws.e-gov.go.jp/law/345AC0000000048#Mp-Ch_2-Se_3-Ss_5-At_30_4
- 著作権法第十条第三項第三号. Retrieved July 17, 2025, from https://laws.e-gov.go.jp/law/345AC0000000048#Mp-Ch_2-Se_1
- 著作権法の一部を改正する法律(平成30年法律第30号)について | 文化庁. Retrieved July 17, 2025, from https://www.bunka.go.jp/seisaku/chosakuken/hokaisei/h30_hokaisei/