
[Solved] Detecting which programming language a piece of code text is written in, using Python

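Python crifan 444 views 0 comments
While working on:
[Unsolved] Using Python to replace Evernote en-codeblock code sections with HTML pre or code
I wanted to take a piece of input text
and use Python to detect which programming language it is written in,
so the result can be used as the brush value of the generated pre tag.
At minimum I want to support: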
  • shell
  • python
  • java
  • ideally also
    • C#
    • Swift
    • C
  • and others
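The goal above (turn a detected language name into a brush value for pre) can be sketched roughly as follows. This is only an illustration: the brush names and the class="brush: ..." attribute follow the SyntaxHighlighter convention, which is an assumption about the target blog, and BRUSH_BY_LANGUAGE, language_to_brush and build_pre_tag are hypothetical names, not part of any library.

```python
import html

# Hypothetical mapping from a detected language name to a brush value.
# The brush names assume SyntaxHighlighter-style aliases; adjust to your highlighter.
BRUSH_BY_LANGUAGE = {
    "shell": "bash",
    "bash": "bash",
    "python": "python",
    "java": "java",
    "c#": "csharp",
    "swift": "swift",
    "c": "cpp",
}


def language_to_brush(language_name, default="plain"):
    """Return the brush value for a language name, or a fallback for unknown ones."""
    return BRUSH_BY_LANGUAGE.get(language_name.strip().lower(), default)


def build_pre_tag(code_snippet, language_name):
    """Wrap an escaped code snippet in a pre tag carrying the brush value."""
    brush = language_to_brush(language_name)
    return '<pre class="brush: {}">{}</pre>'.format(brush, html.escape(code_snippet))
```

Unknown names fall back to a plain brush, so a wrong or unsupported detection result still produces valid HTML.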
python language detection
python detect programming language
guesslang · PyPI
pip install guesslang
python – Is there a library that will detect the source code language of a block of code? – Stack Overflow
Detecting programming language from a snippet – Stack Overflow
blackducksoftware/ohcount: The Ohloh source code line counter
github/linguist: Language Savant. If your repository’s language is being reported incorrectly, send us a pull request!
Automatically detect the programming language of given script in python – Stack Overflow
yoeo/guesslang: Detect the programming language of a source code
pip3 install guesslang
src-d/enry: A faster file programming language detector
Guesslang documentation — Guesslang 0.9.4 documentation
Let's try it:
pip install guesslang
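pip finds the package, but no matching version of its dependency is available:
[Solved] Installing guesslang on Mac fails with: ERROR No matching distribution found for tensorflow 1.7.0rc1 from guesslang
Next, write some code to try it out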
Guesslang package — Guesslang 0.9.4 documentation
Code:
from guesslang import Guess

def detectProgramLanguage(codeSnippet):
    """Detect code snippet possible programming language

    Args:
        codeSnippet (str): input string of code snippet
    Returns:
        str, programming language
    Raises:
    """
    guessInstance = Guess()
    languageName = guessInstance.language_name(codeSnippet)
    return languageName
As it turned out, merely running import guesslang is very slow, and it also prints warnings:
/Users/crifan/.virtualenvs/EvernoteToWordpress-PC3x4gk8/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/Users/crifan/.virtualenvs/EvernoteToWordpress-PC3x4gk8/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:518: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/Users/crifan/.virtualenvs/EvernoteToWordpress-PC3x4gk8/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/Users/crifan/.virtualenvs/EvernoteToWordpress-PC3x4gk8/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/Users/crifan/.virtualenvs/EvernoteToWordpress-PC3x4gk8/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:521: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/Users/crifan/.virtualenvs/EvernoteToWordpress-PC3x4gk8/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
/Users/crifan/.virtualenvs/EvernoteToWordpress-PC3x4gk8/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: compiletime version 3.6 of module 'tensorflow.python.framework.fast_tensor_util' does not match runtime version 3.7
  return f(*args, **kwds)
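The warnings above do not affect the result, so they can be hidden. A minimal sketch, assuming TensorFlow logs through the standard logger named tensorflow (as the WARNING:tensorflow: prefix in the output suggests) and honours the TF_CPP_MIN_LOG_LEVEL environment variable; quiet_tensorflow_noise is a hypothetical helper name:

```python
import logging
import os
import warnings


def quiet_tensorflow_noise():
    """Install filters for the noise seen above; call this BEFORE importing guesslang."""
    # Hide INFO and WARNING messages from TensorFlow's C++ core (e.g. the AVX2/FMA notice).
    os.environ.setdefault("TF_CPP_MIN_LOG_LEVEL", "2")
    # Hide the numpy dtype FutureWarning lines.
    warnings.filterwarnings("ignore", category=FutureWarning)
    # Hide the "compiletime version ... does not match runtime version" warning.
    warnings.filterwarnings("ignore", category=RuntimeWarning, message="compiletime version")
    # Hide deprecation and info messages from TensorFlow's Python logger.
    logging.getLogger("tensorflow").setLevel(logging.ERROR)


quiet_tensorflow_noise()
# from guesslang import Guess  # import only after the filters are installed
```

This only silences the output; it does nothing for the import time itself.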
Then, for this input:
➜  ~ pyenv install 3.7.3
python-build: use openssl from homebrew
python-build: use readline from homebrew
Downloading Python-3.7.3.tar.xz...
-> https://www.python.org/ftp/python/3.7.3/Python-3.7.3.tar.xz
Detection is rather slow:
guessInstance = Guess()
takes many seconds to finish, and prints:
WARNING:tensorflow:From /Users/crifan/.virtualenvs/EvernoteToWordpress-PC3x4gk8/lib/python3.7/site-packages/tensorflow/contrib/learn/python/learn/datasets/base.py:198: retry (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.
Instructions for updating:
Use the retry module or similar alternatives.
20200308 08:21:45 tf_logging.py:126  WARNING From /Users/crifan/.virtualenvs/EvernoteToWordpress-PC3x4gk8/lib/python3.7/site-packages/tensorflow/contrib/learn/python/learn/datasets/base.py:198: retry (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.
Instructions for updating:
Use the retry module or similar alternatives.
WARNING:tensorflow:From /Users/crifan/.virtualenvs/EvernoteToWordpress-PC3x4gk8/lib/python3.7/site-packages/guesslang/guesser.py:60: calling DNNLinearCombinedClassifier.__init__ (from tensorflow.contrib.learn.python.learn.estimators.dnn_linear_combined) with fix_global_step_increment_bug=False is deprecated and will be removed after 2017-04-15.
Instructions for updating:
Please set fix_global_step_increment_bug=True and update training steps in your pipeline. See pydoc for details.
20200308 08:21:47 tf_logging.py:126  WARNING From /Users/crifan/.virtualenvs/EvernoteToWordpress-PC3x4gk8/lib/python3.7/site-packages/guesslang/guesser.py:60: calling DNNLinearCombinedClassifier.__init__ (from tensorflow.contrib.learn.python.learn.estimators.dnn_linear_combined) with fix_global_step_increment_bug=False is deprecated and will be removed after 2017-04-15.
Instructions for updating:
Please set fix_global_step_increment_bug=True and update training steps in your pipeline. See pydoc for details.
WARNING:tensorflow:From /Users/crifan/.virtualenvs/EvernoteToWordpress-PC3x4gk8/lib/python3.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/dnn_linear_combined.py:676: multi_class_head (from tensorflow.contrib.learn.python.learn.estimators.head) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.contrib.estimator.*_head.
20200308 08:21:47 tf_logging.py:126  WARNING From /Users/crifan/.virtualenvs/EvernoteToWordpress-PC3x4gk8/lib/python3.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/dnn_linear_combined.py:676: multi_class_head (from tensorflow.contrib.learn.python.learn.estimators.head) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.contrib.estimator.*_head.
WARNING:tensorflow:From /Users/crifan/.virtualenvs/EvernoteToWordpress-PC3x4gk8/lib/python3.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py:1165: BaseEstimator.__init__ (from tensorflow.contrib.learn.python.learn.estimators.estimator) is deprecated and will be removed in a future version.
Instructions for updating:
Please replace uses of any Estimator from tf.contrib.learn with an Estimator from tf.estimator.*
20200308 08:21:47 tf_logging.py:126  WARNING From /Users/crifan/.virtualenvs/EvernoteToWordpress-PC3x4gk8/lib/python3.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py:1165: BaseEstimator.__init__ (from tensorflow.contrib.learn.python.learn.estimators.estimator) is deprecated and will be removed in a future version.
Instructions for updating:
Please replace uses of any Estimator from tf.contrib.learn with an Estimator from tf.estimator.*
WARNING:tensorflow:From /Users/crifan/.virtualenvs/EvernoteToWordpress-PC3x4gk8/lib/python3.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py:427: RunConfig.__init__ (from tensorflow.contrib.learn.python.learn.estimators.run_config) is deprecated and will be removed in a future version.
Instructions for updating:
When switching to tf.estimator.Estimator, use tf.estimator.RunConfig instead.
20200308 08:21:47 tf_logging.py:126  WARNING From /Users/crifan/.virtualenvs/EvernoteToWordpress-PC3x4gk8/lib/python3.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py:427: RunConfig.__init__ (from tensorflow.contrib.learn.python.learn.estimators.run_config) is deprecated and will be removed in a future version.
Instructions for updating:
When switching to tf.estimator.Estimator, use tf.estimator.RunConfig instead.
INFO:tensorflow:Using default config.
20200308 08:21:47 tf_logging.py:116  INFO    Using default config.
INFO:tensorflow:Using config: {'_task_type': None, '_task_id': 0, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x12c17b278>, '_master': '', '_num_ps_replicas': 0, '_num_worker_replicas': 0, '_environment': 'local', '_is_chief': True, '_evaluation_master': '', '_tf_config': gpu_options {
  per_process_gpu_memory_fraction: 1.0
}
, '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_secs': 600, '_log_step_count_steps': 100, '_session_config': None, '_save_checkpoints_steps': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_model_dir': '/Users/crifan/.local/share/virtualenvs/EvernoteToWordpress-PC3x4gk8/lib/python3.7/site-packages/guesslang/data/model'}
20200308 08:21:47 tf_logging.py:116  INFO    Using config: {'_task_type': None, '_task_id': 0, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x12c17b278>, '_master': '', '_num_ps_replicas': 0, '_num_worker_replicas': 0, '_environment': 'local', '_is_chief': True, '_evaluation_master': '', '_tf_config': gpu_options {
  per_process_gpu_memory_fraction: 1.0
}
, '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_secs': 600, '_log_step_count_steps': 100, '_session_config': None, '_save_checkpoints_steps': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_model_dir': '/Users/crifan/.local/share/virtualenvs/EvernoteToWordpress-PC3x4gk8/lib/python3.7/site-packages/guesslang/data/model'}
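Since constructing Guess() is the slow step, one way to soften the cost when many snippets are processed is to build the instance once and reuse it. A minimal sketch: Guess and language_name come from the code above, while make_cached_detector, guesslang_factory and detect_with_guesslang are hypothetical names, and the import is deferred so nothing is loaded until the first call.

```python
import functools


def make_cached_detector(factory):
    """Return a detect(code_snippet) function that builds the guesser at most once."""

    @functools.lru_cache(maxsize=1)
    def get_instance():
        # Runs the slow construction on first use only; later calls reuse the result.
        return factory()

    def detect(code_snippet):
        return get_instance().language_name(code_snippet)

    return detect


def guesslang_factory():
    """Import guesslang lazily so the import cost is paid only when detection is used."""
    from guesslang import Guess
    return Guess()


# Nothing slow happens here; guesslang is imported on the first detect call.
detect_with_guesslang = make_cached_detector(guesslang_factory)
```

The first call still pays the full import and construction time; only later calls in the same process benefit.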
Then:
languageName = guessInstance.language_name(codeSnippet)
runs at normal speed and prints:
DEBUG:tensorflow:Transforming feature_column _RealValuedColumn(column_name='', dimension=1024, default_value=None, dtype=tf.float32, normalizer=None)
DEBUG:tensorflow:Transforming feature_column _RealValuedColumn(column_name='', dimension=1024, default_value=None, dtype=tf.float32, normalizer=None)
WARNING:tensorflow:From /Users/crifan/.virtualenvs/EvernoteToWordpress-PC3x4gk8/lib/python3.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/head.py:678: ModelFnOps.__new__ (from tensorflow.contrib.learn.python.learn.estimators.model_fn) is deprecated and will be removed in a future version.
Instructions for updating:
When switching to tf.estimator.Estimator, use tf.estimator.EstimatorSpec. You can use the `estimator_spec` method to create an equivalent one.
20200308 08:22:37 tf_logging.py:126  WARNING From /Users/crifan/.virtualenvs/EvernoteToWordpress-PC3x4gk8/lib/python3.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/head.py:678: ModelFnOps.__new__ (from tensorflow.contrib.learn.python.learn.estimators.model_fn) is deprecated and will be removed in a future version.
Instructions for updating:
When switching to tf.estimator.Estimator, use tf.estimator.EstimatorSpec. You can use the `estimator_spec` method to create an equivalent one.
INFO:tensorflow:Graph was finalized.
20200308 08:22:37 tf_logging.py:116  INFO    Graph was finalized.
2020-03-08 20:22:37.351329: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
INFO:tensorflow:Restoring parameters from /Users/crifan/.local/share/virtualenvs/EvernoteToWordpress-PC3x4gk8/lib/python3.7/site-packages/guesslang/data/model/model.ckpt-40198
20200308 08:22:37 tf_logging.py:116  INFO    Restoring parameters from /Users/crifan/.local/share/virtualenvs/EvernoteToWordpress-PC3x4gk8/lib/python3.7/site-packages/guesslang/data/model/model.ckpt-40198
INFO:tensorflow:Running local_init_op.
20200308 08:22:37 tf_logging.py:116  INFO    Running local_init_op.
INFO:tensorflow:Done running local_init_op.
20200308 08:22:37 tf_logging.py:116  INFO    Done running local_init_op.
The detected result is:
CSS
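which is wrong as well.
[Summary]
Although guesslang appears to support detecting many languages,
it is far too slow: it takes tens of seconds to produce a result, which is unacceptable. Giving up on it.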
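To see where those tens of seconds go, each phase (import, construction, detection) can be timed separately. A small sketch using only the standard library; measure is a hypothetical helper name, and the guesslang lines are left as comments because they need the package installed:

```python
import time


def measure(func, *args, **kwargs):
    """Call func and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


# Possible usage with guesslang (not executed here):
# import importlib
# module, import_seconds = measure(importlib.import_module, "guesslang")
# guesser, build_seconds = measure(module.Guess)
# name, detect_seconds = measure(guesser.language_name, "ls -la")
```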
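[Postscript 20210102]
Switched to detecting the language with my own code:
[Solved] Using Python to detect which programming language a code string is written in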
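The actual implementation lives in the linked post and is not reproduced here. Purely as an illustration of the general idea, a hand-written detector can score a snippet against a few regular-expression hints per language. Everything below (LANGUAGE_HINTS, detect_language, the specific patterns) is a hypothetical sketch, not the author's code, and it covers only shell, python and java.

```python
import re

# Hypothetical hint patterns; every match adds one point to that language.
LANGUAGE_HINTS = {
    "shell": [
        r"^\s*(?:\$|➜)\s+\S",  # prompt character at the start of a line
        r"^\s*(?:sudo|cd|ls|pip3?|brew|pyenv|apt-get|export)\b",
        r"^#!/bin/(?:ba)?sh",
    ],
    "python": [
        r"^\s*def\s+\w+\s*\(.*\)\s*:",
        r"^\s*(?:from\s+[\w.]+\s+)?import\s+[\w., ]+$",  # no trailing semicolon
        r"^\s*(?:if|for|while|class)\b.*:\s*$",
    ],
    "java": [
        r"\bpublic\s+(?:static\s+)?(?:class|void|int|String)\b",
        r"\bSystem\.out\.println\s*\(",
        r"^\s*import\s+[\w.]+;\s*$",
    ],
}


def detect_language(code_snippet, default="text"):
    """Return the language with the most hint matches, or default if nothing matches."""
    scores = {}
    for language, patterns in LANGUAGE_HINTS.items():
        scores[language] = sum(
            len(re.findall(pattern, code_snippet, flags=re.MULTILINE))
            for pattern in patterns
        )
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default
```

Such a detector runs in well under a second because it loads no model, at the price of supporting only the languages and idioms its patterns were written for.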

