python-magic-ahupp / a43b2cd4-032e-4fd5-a238-8a6080e17cbf/upstream
Import upstream version 0.4.20 Kali Janitor 3 years ago
27 changed file(s) with 1511 addition(s) and 633 deletion(s).
+0
-2
.gitignore
0 deb_dist
1 python_magic.egg-info
+0
-27
.travis.yml
0 language: python
1
2 # needed to use trusty
3 sudo: required
4
5 dist: trusty
6
7 python:
8 - "2.6"
9 - "2.7"
10 - "3.3"
11 - "3.4"
12 - "3.5"
13 - "3.6"
14 - "nightly"
15
16 install:
17 - pip install coveralls
18 - pip install codecov
19 - python setup.py install
20
21 script:
22 - coverage run setup.py test
23
24 after_success:
25 - coveralls
26 - codecov
1818 LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
1919 OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
2020 SOFTWARE.
21
22
23 ====
24
25 Portions of this package (magic/compat.py and test/libmagic_test.py)
26 are distributed under the following copyright notice:
27
28
29 $File: LEGAL.NOTICE,v 1.15 2006/05/03 18:48:33 christos Exp $
30 Copyright (c) Ian F. Darwin 1986, 1987, 1989, 1990, 1991, 1992, 1994, 1995.
31 Software written by Ian F. Darwin and others;
32 maintained 1994- Christos Zoulas.
33
34 This software is not subject to any export provision of the United States
35 Department of Commerce, and may be exported to any country or planet.
36
37 Redistribution and use in source and binary forms, with or without
38 modification, are permitted provided that the following conditions
39 are met:
40 1. Redistributions of source code must retain the above copyright
41 notice immediately at the beginning of the file, without modification,
42 this list of conditions, and the following disclaimer.
43 2. Redistributions in binary form must reproduce the above copyright
44 notice, this list of conditions and the following disclaimer in the
45 documentation and/or other materials provided with the distribution.
46
47 THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
48 ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
49 IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
50 ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR
51 ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
52 DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
53 OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
54 HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
55 LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
56 OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
57 SUCH DAMAGE.
00 include *.py
11 include LICENSE
2 include test/testdata/*
3 include test/*.sh
2 graft tests
3 global-exclude __pycache__
4 global-exclude *.py[co]
0 Metadata-Version: 2.1
1 Name: python-magic
2 Version: 0.4.20
3 Summary: File type identification using libmagic
4 Home-page: http://github.com/ahupp/python-magic
5 Author: Adam Hupp
6 Author-email: [email protected]
7 License: MIT
8 Description: # python-magic
9 [![PyPI version](https://badge.fury.io/py/python-magic.svg)](https://badge.fury.io/py/python-magic)
10 [![Build Status](https://travis-ci.org/ahupp/python-magic.svg?branch=master)](https://travis-ci.org/ahupp/python-magic)
11
12 python-magic is a Python interface to the libmagic file type
13 identification library. libmagic identifies file types by checking
14 their headers according to a predefined list of file types. This
15 functionality is exposed to the command line by the Unix command
16 `file`.
17
18 ## Usage
19
20 ```python
21 >>> import magic
22 >>> magic.from_file("testdata/test.pdf")
23 'PDF document, version 1.2'
24 # recommend using at least the first 2048 bytes, as less can produce incorrect identification
25 >>> magic.from_buffer(open("testdata/test.pdf", "rb").read(2048))
26 'PDF document, version 1.2'
27 >>> magic.from_file("testdata/test.pdf", mime=True)
28 'application/pdf'
29 ```
30
31 There is also a `Magic` class that provides more direct control,
32 including overriding the magic database file and turning on character
33 encoding detection. This is not recommended for general use. In
34 particular, it's not safe for sharing across multiple threads and
35 will throw if this is attempted.
36
37 ```python
38 >>> f = magic.Magic(uncompress=True)
39 >>> f.from_file('testdata/test.gz')
40 'ASCII text (gzip compressed data, was "test", last modified: Sat Jun 28
41 21:32:52 2008, from Unix)'
42 ```
43
44 You can also combine the flag options:
45
46 ```python
47 >>> f = magic.Magic(mime=True, uncompress=True)
48 >>> f.from_file('testdata/test.gz')
49 'text/plain'
50 ```
51
52 ## Installation
53
54 The current stable version of python-magic is available on PyPI and
55 can be installed by running `pip install python-magic`.
56
57 Other sources:
58
59 - PyPI: http://pypi.python.org/pypi/python-magic/
60 - GitHub: https://github.com/ahupp/python-magic
61
62 This module is a simple wrapper around the libmagic C library, and
63 that must be installed as well:
64
65 ### Debian/Ubuntu
66
67 ```
68 sudo apt-get install libmagic1
69 ```
70
71 ### Windows
72
73 You'll need DLLs for libmagic. @julian-r maintains a PyPI package with the DLLs; you can fetch it with:
74
75 ```
76 pip install python-magic-bin
77 ```
78
79 ### OSX
80
81 - When using Homebrew: `brew install libmagic`
82 - When using macports: `port install file`
83
84 ### Troubleshooting
85
86 - 'MagicException: could not find any magic files!': some
87 installations of libmagic do not correctly point to their magic
88 database file. Try specifying the path to the file explicitly in the
89 constructor: `magic.Magic(magic_file="path_to_magic_file")`.
90
91 - 'WindowsError: [Error 193] %1 is not a valid Win32 application':
92 Attempting to run the 32-bit libmagic DLL in a 64-bit build of
93 python will fail with this error. Here are 64-bit builds of libmagic for windows: https://github.com/pidydx/libmagicwin64.
94 A newer version can be found here: https://github.com/nscaife/file-windows.
95
96 - 'WindowsError: exception: access violation writing 0x00000000 ' This may indicate you are mixing
97 Windows Python and Cygwin Python. Make sure your libmagic and python builds are consistent.
98
99
100 ## Bug Reports
101
102 python-magic is a thin layer over the libmagic C library.
103 Historically, most bugs that have been reported against python-magic
104 are actually bugs in libmagic; libmagic bugs can be reported on their
105 tracker here: https://bugs.astron.com/my_view_page.php. If you're not
106 sure where the bug lies feel free to file an issue on GitHub and I can
107 triage it.
108
109 ## Running the tests
110
111 To run the tests across a variety of linux distributions (depends on Docker):
112
113 ```
114 ./test_docker.sh
115 ```
116
117 To run tests locally across all available python versions:
118
119 ```
120 ./test/run.py
121 ```
122
123 To run against a specific python version:
124
125 ```
126 LC_ALL=en_US.UTF-8 python3 test/test.py
127 ```
128
129 ## libmagic and python-magic
130
131 See [COMPAT.md](COMPAT.md) for a guide to libmagic / python-magic compatibility.
132
133 ## Versioning
134
135 Minor version bumps should be backwards compatible. Major bumps are not.
136
137 ## Author
138
139 Written by Adam Hupp in 2001 for a project that never got off the
140 ground. It originally used SWIG for the C library bindings, but
141 switched to ctypes once that was part of the python standard library.
142
143 You can contact me via my [website](http://hupp.org/adam) or
144 [GitHub](http://github.com/ahupp).
145
146 ## License
147
148 python-magic is distributed under the MIT license. See the included
149 LICENSE file for details.
150
151 I am providing code in the repository to you under an open source license. Because this is my personal repository, the license you receive to my code is from me and not my employer (Facebook).
152
153 Keywords: mime magic file
154 Platform: UNKNOWN
155 Classifier: Intended Audience :: Developers
156 Classifier: License :: OSI Approved :: MIT License
157 Classifier: Programming Language :: Python
158 Classifier: Programming Language :: Python :: 2.7
159 Classifier: Programming Language :: Python :: 3
160 Classifier: Programming Language :: Python :: 3.5
161 Classifier: Programming Language :: Python :: 3.6
162 Classifier: Programming Language :: Python :: 3.7
163 Classifier: Programming Language :: Python :: 3.8
164 Classifier: Programming Language :: Python :: 3.9
165 Classifier: Programming Language :: Python :: Implementation :: CPython
166 Requires-Python: >=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*
167 Description-Content-Type: text/markdown
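
The README text in this file warns that a `Magic` instance is not safe to share across threads. A minimal sketch of one way to follow that advice, assuming one instance per thread is acceptable: the helper `per_thread` below is hypothetical (it is not part of python-magic) and caches whatever `factory` returns in `threading.local` storage. In real use `factory` might be `lambda: magic.Magic(mime=True)`; a plain `object` stands in here so the sketch runs without libmagic.

```python
import threading


def per_thread(factory):
    """Return a getter that builds one object per thread using `factory`.

    Hypothetical helper, not part of python-magic.
    """
    local = threading.local()

    def get():
        # Each thread sees its own attribute namespace on `local`.
        if not hasattr(local, "instance"):
            local.instance = factory()
        return local.instance

    return get


# Stand-in factory; with python-magic installed this could be
# `lambda: magic.Magic(mime=True)` (an assumption, not exercised here).
get_detector = per_thread(object)

main_instance = get_detector()
seen = []
worker = threading.Thread(target=lambda: seen.append(get_detector()))
worker.start()
worker.join()
```

Repeated calls on the same thread return the cached object, while the worker thread receives its own.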
11 [![PyPI version](https://badge.fury.io/py/python-magic.svg)](https://badge.fury.io/py/python-magic)
22 [![Build Status](https://travis-ci.org/ahupp/python-magic.svg?branch=master)](https://travis-ci.org/ahupp/python-magic)
33
4 python-magic is a python interface to the libmagic file type
4 python-magic is a Python interface to the libmagic file type
55 identification library. libmagic identifies file types by checking
66 their headers according to a predefined list of file types. This
77 functionality is exposed to the command line by the Unix command
1313 >>> import magic
1414 >>> magic.from_file("testdata/test.pdf")
1515 'PDF document, version 1.2'
16 >>> magic.from_buffer(open("testdata/test.pdf").read(1024))
16 # recommend using at least the first 2048 bytes, as less can produce incorrect identification
17 >>> magic.from_buffer(open("testdata/test.pdf", "rb").read(2048))
1718 'PDF document, version 1.2'
1819 >>> magic.from_file("testdata/test.pdf", mime=True)
1920 'application/pdf'
4041 'text/plain'
4142 ```
4243
43 ## Name Conflict
44
45 There are, sadly, two libraries which use the module name `magic`. Both have been around for quite a while. If you are using this module and get an error using a method like `open`, your code is expecting the other one. Hopefully one day these will be reconciled.
46
4744 ## Installation
4845
49 The current stable version of python-magic is available on pypi and
46 The current stable version of python-magic is available on PyPI and
5047 can be installed by running `pip install python-magic`.
5148
5249 Other sources:
5350
54 - pypi: http://pypi.python.org/pypi/python-magic/
55 - github: https://github.com/ahupp/python-magic
51 - PyPI: http://pypi.python.org/pypi/python-magic/
52 - GitHub: https://github.com/ahupp/python-magic
5653
57 ### Dependencies
54 This module is a simple wrapper around the libmagic C library, and
55 that must be installed as well:
5856
59 On Windows, copy magic1.dll, regex2.dll, and zlib1.dll onto your PATH from the Binaries and Dependencies zipfiles provided by the [File for Windows](http://gnuwin32.sourceforge.net/packages/file.htm) project. You will need to copy the file `magic` out of `[binary-zip]\share\misc`, and pass its location to `Magic(magic_file=...)`. If you are using a 64-bit build of python, you'll need 64-bit libmagic binaries which can be found here: https://github.com/pidydx/libmagicwin64 (note: untested)
57 ### Debian/Ubuntu
6058
61 On OSX:
59 ```
60 sudo apt-get install libmagic1
61 ```
62
63 ### Windows
64
65 You'll need DLLs for libmagic. @julian-r maintains a PyPI package with the DLLs; you can fetch it with:
66
67 ```
68 pip install python-magic-bin
69 ```
70
71 ### OSX
6272
6373 - When using Homebrew: `brew install libmagic`
6474 - When using macports: `port install file`
7282
7383 - 'WindowsError: [Error 193] %1 is not a valid Win32 application':
7484 Attempting to run the 32-bit libmagic DLL in a 64-bit build of
75 python will fail with this error. Here are 64-bit builds of libmagic for windows: https://github.com/pidydx/libmagicwin64
85 python will fail with this error. Here are 64-bit builds of libmagic for windows: https://github.com/pidydx/libmagicwin64.
86 A newer version can be found here: https://github.com/nscaife/file-windows.
7687
77 - 'WindowsError: exception: access violation writing 0x00000000 ' This may indicate you are mixing
88 - 'WindowsError: exception: access violation writing 0x00000000 ' This may indicate you are mixing
7889 Windows Python and Cygwin Python. Make sure your libmagic and python builds are consistent.
90
91
92 ## Bug Reports
93
94 python-magic is a thin layer over the libmagic C library.
95 Historically, most bugs that have been reported against python-magic
96 are actually bugs in libmagic; libmagic bugs can be reported on their
97 tracker here: https://bugs.astron.com/my_view_page.php. If you're not
98 sure where the bug lies feel free to file an issue on GitHub and I can
99 triage it.
100
101 ## Running the tests
102
103 To run the tests across a variety of linux distributions (depends on Docker):
104
105 ```
106 ./test_docker.sh
107 ```
108
109 To run tests locally across all available python versions:
110
111 ```
112 ./test/run.py
113 ```
114
115 To run against a specific python version:
116
117 ```
118 LC_ALL=en_US.UTF-8 python3 test/test.py
119 ```
120
121 ## libmagic and python-magic
122
123 See [COMPAT.md](COMPAT.md) for a guide to libmagic / python-magic compatibility.
124
125 ## Versioning
126
127 Minor version bumps should be backwards compatible. Major bumps are not.
79128
80129 ## Author
81130
84133 switched to ctypes once that was part of the python standard library.
85134
86135 You can contact me via my [website](http://hupp.org/adam) or
87 [github](http://github.com/ahupp).
88
89 ## Contributors
90
91 Thanks to these folks on github who submitted features and bugfixes.
92
93 - Amit Sethi
94 - [bigben87](https://github.com/bigben87)
95 - [fallgesetz](https://github.com/fallgesetz)
96 - [FlaPer87](https://github.com/FlaPer87)
97 - [lukenowak](https://github.com/lukenowak)
98 - NicolasDelaby
99 - [email protected]
100 - SimpleSeb
101 - [tehmaze](https://github.com/tehmaze)
136 [GitHub](http://github.com/ahupp).
102137
103138 ## License
104139
105140 python-magic is distributed under the MIT license. See the included
106141 LICENSE file for details.
107142
108
143 I am providing code in the repository to you under an open source license. Because this is my personal repository, the license you receive to my code is from me and not my employer (Facebook).
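
The README change in this diff switches the `from_buffer` example to binary mode and to a 2048-byte read. A minimal sketch of that reading step follows; `read_header` is a hypothetical helper name, and the temporary file only exists so the example runs without any test data or libmagic installed.

```python
import os
import tempfile


def read_header(path, size=2048):
    """Return up to `size` leading bytes of `path`, read in binary mode."""
    with open(path, "rb") as fh:
        return fh.read(size)


# Demonstration with a throwaway file so the sketch runs anywhere.
fd, tmp_path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "wb") as fh:
        fh.write(b"%PDF-1.2\n" + b"\x00" * 4096)
    header = read_header(tmp_path)
finally:
    os.remove(tmp_path)

# With python-magic installed, `header` could then be passed to
# magic.from_buffer(header).
```

Reading in binary mode matters on Python 3, where a text-mode `read()` would try to decode the file and could fail on arbitrary binary content.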
0 """
1 magic is a wrapper around the libmagic file identification library.
2
3 See README for more information.
4
5 Usage:
6
7 >>> import magic
8 >>> magic.from_file("testdata/test.pdf")
9 'PDF document, version 1.2'
10 >>> magic.from_file("testdata/test.pdf", mime=True)
11 'application/pdf'
12 >>> magic.from_buffer(open("testdata/test.pdf", "rb").read(2048))
13 'PDF document, version 1.2'
14 >>>
15
16 """
17
18 import sys
19 import glob
20 import ctypes
21 import ctypes.util
22 import threading
23 import logging
24
25 from ctypes import c_char_p, c_int, c_size_t, c_void_p, byref, POINTER
26
27 # avoid shadowing the real open with the version from compat.py
28 _real_open = open
29
30
31 class MagicException(Exception):
32 def __init__(self, message):
33 super(MagicException, self).__init__(message)
34 self.message = message
35
36
37 class Magic:
38 """
39 Magic is a wrapper around the libmagic C library.
40 """
41
42 def __init__(self, mime=False, magic_file=None, mime_encoding=False,
43 keep_going=False, uncompress=False, raw=False, extension=False):
44 """
45 Create a new libmagic wrapper.
46
47 mime - if True, mimetypes are returned instead of textual descriptions
48 mime_encoding - if True, codec is returned
49 magic_file - use a magic database other than the system default
50 keep_going - don't stop at the first match, keep going
51 uncompress - Try to look inside compressed files.
52 raw - Do not try to decode "non-printable" chars.
53 extension - Print a slash-separated list of valid extensions for the file type found.
54 """
55
56 self.cookie = None
57 self.flags = MAGIC_NONE
58 if mime:
59 self.flags |= MAGIC_MIME_TYPE
60 if mime_encoding:
61 self.flags |= MAGIC_MIME_ENCODING
62 if keep_going:
63 self.flags |= MAGIC_CONTINUE
64 if uncompress:
65 self.flags |= MAGIC_COMPRESS
66 if raw:
67 self.flags |= MAGIC_RAW
68 if extension:
69 self.flags |= MAGIC_EXTENSION
70
71 self.cookie = magic_open(self.flags)
72 self.lock = threading.Lock()
73
74 magic_load(self.cookie, magic_file)
75
76 # MAGIC_EXTENSION was added in 523 or 524, so bail if
77 # it doesn't appear to be available
78 if extension and (not _has_version or version() < 524):
79 raise NotImplementedError('MAGIC_EXTENSION is not supported in this version of libmagic')
80
81 # For https://github.com/ahupp/python-magic/issues/190
82 # libmagic has fixed internal limits that some files exceed, causing
83 # an error. We can avoid this (at least for the sample file given)
84 # by bumping the limit up. It's not clear if this is a general solution
85 # or whether other internal limits should be increased, but given
86 # the lack of other reports I'll assume this is rare.
87 if _has_param:
88 try:
89 self.setparam(MAGIC_PARAM_NAME_MAX, 64)
90 except MagicException as e:
91 # some versions of libmagic fail this call,
92 # so rather than fail hard just use default behavior
93 pass
94
95 def from_buffer(self, buf):
96 """
97 Identify the contents of `buf`
98 """
99 with self.lock:
100 try:
101 # if we're on python3, convert buf to bytes
102 # otherwise this string is passed as wchar*
103 # which is not what libmagic expects
104 if type(buf) == str and str != bytes:
105 buf = buf.encode('utf-8', errors='replace')
106 return maybe_decode(magic_buffer(self.cookie, buf))
107 except MagicException as e:
108 return self._handle509Bug(e)
109
110 def from_file(self, filename):
111 # raise FileNotFoundError or IOError if the file does not exist
112 with _real_open(filename):
113 pass
114
115 with self.lock:
116 try:
117 return maybe_decode(magic_file(self.cookie, filename))
118 except MagicException as e:
119 return self._handle509Bug(e)
120
121 def from_descriptor(self, fd):
122 with self.lock:
123 try:
124 return maybe_decode(magic_descriptor(self.cookie, fd))
125 except MagicException as e:
126 return self._handle509Bug(e)
127
128 def _handle509Bug(self, e):
129 # libmagic 5.09 has a bug where it might fail to identify the
130 # mimetype of a file and returns null from magic_file (and
131 # likely _buffer), but also does not return an error message.
132 if e.message is None and (self.flags & MAGIC_MIME_TYPE):
133 return "application/octet-stream"
134 else:
135 raise e
136
137 def setparam(self, param, val):
138 return magic_setparam(self.cookie, param, val)
139
140 def getparam(self, param):
141 return magic_getparam(self.cookie, param)
142
143 def __del__(self):
144 # no _thread_check here because there can be no other
145 # references to this object at this point.
146
147 # during shutdown magic_close may have been cleared already so
148 # make sure it exists before using it.
149
150 # the self.cookie check should be unnecessary and was an
151 # incorrect fix for a threading problem, however I'm leaving
152 # it in because it's harmless and I'm slightly afraid to
153 # remove it.
154 if self.cookie and magic_close:
155 magic_close(self.cookie)
156 self.cookie = None
157
158
159 _instances = {}
160
161
162 def _get_magic_type(mime):
163 i = _instances.get(mime)
164 if i is None:
165 i = _instances[mime] = Magic(mime=mime)
166 return i
167
168
169 def from_file(filename, mime=False):
170 """
171 Accepts a filename and returns the detected filetype. Return
172 value is the mimetype if mime=True, otherwise a human readable
173 name.
174
175 >>> magic.from_file("testdata/test.pdf", mime=True)
176 'application/pdf'
177 """
178 m = _get_magic_type(mime)
179 return m.from_file(filename)
180
181
182 def from_buffer(buffer, mime=False):
183 """
184 Accepts a binary string and returns the detected filetype. Return
185 value is the mimetype if mime=True, otherwise a human readable
186 name.
187
188 >>> magic.from_buffer(open("testdata/test.pdf", "rb").read(2048))
189 'PDF document, version 1.2'
190 """
191 m = _get_magic_type(mime)
192 return m.from_buffer(buffer)
193
194
195 def from_descriptor(fd, mime=False):
196 """
197 Accepts a file descriptor and returns the detected filetype. Return
198 value is the mimetype if mime=True, otherwise a human readable
199 name.
200
201 >>> f = open("testdata/test.pdf")
202 >>> magic.from_descriptor(f.fileno())
203 'PDF document, version 1.2'
204 """
205 m = _get_magic_type(mime)
206 return m.from_descriptor(fd)
207
208
209 libmagic = None
210 # Let's try to find magic or magic1
211 dll = ctypes.util.find_library('magic') \
212 or ctypes.util.find_library('magic1') \
213 or ctypes.util.find_library('cygmagic-1') \
214 or ctypes.util.find_library('libmagic-1') \
215 or ctypes.util.find_library('msys-magic-1') # for MSYS2
216
217 # necessary because find_library returns None if it doesn't find the library
218 if dll:
219 libmagic = ctypes.CDLL(dll)
220
221 if not libmagic or not libmagic._name:
222 windows_dlls = ['magic1.dll', 'cygmagic-1.dll', 'libmagic-1.dll', 'msys-magic-1.dll']
223 platform_to_lib = {'darwin': ['/opt/local/lib/libmagic.dylib',
224 '/usr/local/lib/libmagic.dylib'] +
225 # Assumes there will only be one version installed
226 glob.glob('/usr/local/Cellar/libmagic/*/lib/libmagic.dylib'), # flake8:noqa
227 'win32': windows_dlls,
228 'cygwin': windows_dlls,
229 'linux': ['libmagic.so.1'],
230 # fallback for some Linuxes (e.g. Alpine) where library search does not work # flake8:noqa
231 }
232 platform = 'linux' if sys.platform.startswith('linux') else sys.platform
233 for dll in platform_to_lib.get(platform, []):
234 try:
235 libmagic = ctypes.CDLL(dll)
236 break
237 except OSError:
238 pass
239
240 if not libmagic or not libmagic._name:
241 # It is better to raise an ImportError since we are importing magic module
242 raise ImportError('failed to find libmagic. Check your installation')
243
244 magic_t = ctypes.c_void_p
245
246
247 def errorcheck_null(result, func, args):
248 if result is None:
249 err = magic_error(args[0])
250 raise MagicException(err)
251 else:
252 return result
253
254
255 def errorcheck_negative_one(result, func, args):
256 if result == -1:
257 err = magic_error(args[0])
258 raise MagicException(err)
259 else:
260 return result
261
262
263 # return str on python3. Don't want to unconditionally
264 # decode because that results in unicode on python2
265 def maybe_decode(s):
266 if str == bytes:
267 return s
268 else:
269 # backslashreplace here because sometimes libmagic will return metadata in the charset
270 # of the file, which is unknown to us (e.g. the title of a Word doc)
271 return s.decode('utf-8', 'backslashreplace')
272
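
As an aside on `maybe_decode`: the `backslashreplace` error handler keeps decoding from raising when libmagic returns bytes that are not valid UTF-8. The sketch below illustrates only that handler on Python 3; `decode_description` is a hypothetical name and the byte strings are made-up inputs, not real libmagic output.

```python
def decode_description(raw):
    """Decode bytes as UTF-8, escaping undecodable bytes instead of raising."""
    return raw.decode("utf-8", "backslashreplace")


clean = decode_description(b"PDF document, version 1.2")
# A lone 0xE9 byte is not valid UTF-8, so it is rendered as the
# four characters backslash, x, e, 9.
escaped = decode_description(b"Title: caf\xe9")
```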
273
274 def coerce_filename(filename):
275 if filename is None:
276 return None
277 # ctypes will implicitly convert unicode strings to bytes with
278 # .encode('ascii'). If you use the filesystem encoding
279 # then you'll get inconsistent behavior (crashes) depending on the user's
280 # LANG environment variable
281 is_unicode = (sys.version_info[0] <= 2 and
282 isinstance(filename, unicode)) or \
283 (sys.version_info[0] >= 3 and
284 isinstance(filename, str))
285 if is_unicode:
286 return filename.encode('utf-8', 'surrogateescape')
287 else:
288 return filename
289
290
291 magic_open = libmagic.magic_open
292 magic_open.restype = magic_t
293 magic_open.argtypes = [c_int]
294
295 magic_close = libmagic.magic_close
296 magic_close.restype = None
297 magic_close.argtypes = [magic_t]
298
299 magic_error = libmagic.magic_error
300 magic_error.restype = c_char_p
301 magic_error.argtypes = [magic_t]
302
303 magic_errno = libmagic.magic_errno
304 magic_errno.restype = c_int
305 magic_errno.argtypes = [magic_t]
306
307 _magic_file = libmagic.magic_file
308 _magic_file.restype = c_char_p
309 _magic_file.argtypes = [magic_t, c_char_p]
310 _magic_file.errcheck = errorcheck_null
311
312
313 def magic_file(cookie, filename):
314 return _magic_file(cookie, coerce_filename(filename))
315
316
317 _magic_buffer = libmagic.magic_buffer
318 _magic_buffer.restype = c_char_p
319 _magic_buffer.argtypes = [magic_t, c_void_p, c_size_t]
320 _magic_buffer.errcheck = errorcheck_null
321
322
323 def magic_buffer(cookie, buf):
324 return _magic_buffer(cookie, buf, len(buf))
325
326
327 magic_descriptor = libmagic.magic_descriptor
328 magic_descriptor.restype = c_char_p
329 magic_descriptor.argtypes = [magic_t, c_int]
330 magic_descriptor.errcheck = errorcheck_null
331
332 _magic_descriptor = libmagic.magic_descriptor
333 _magic_descriptor.restype = c_char_p
334 _magic_descriptor.argtypes = [magic_t, c_int]
335 _magic_descriptor.errcheck = errorcheck_null
336
337
338 def magic_descriptor(cookie, fd):
339 return _magic_descriptor(cookie, fd)
340
341
342 _magic_load = libmagic.magic_load
343 _magic_load.restype = c_int
344 _magic_load.argtypes = [magic_t, c_char_p]
345 _magic_load.errcheck = errorcheck_negative_one
346
347
348 def magic_load(cookie, filename):
349 return _magic_load(cookie, coerce_filename(filename))
350
351
352 magic_setflags = libmagic.magic_setflags
353 magic_setflags.restype = c_int
354 magic_setflags.argtypes = [magic_t, c_int]
355
356 magic_check = libmagic.magic_check
357 magic_check.restype = c_int
358 magic_check.argtypes = [magic_t, c_char_p]
359
360 magic_compile = libmagic.magic_compile
361 magic_compile.restype = c_int
362 magic_compile.argtypes = [magic_t, c_char_p]
363
364 _has_param = False
365 if hasattr(libmagic, 'magic_setparam') and hasattr(libmagic, 'magic_getparam'):
366 _has_param = True
367 _magic_setparam = libmagic.magic_setparam
368 _magic_setparam.restype = c_int
369 _magic_setparam.argtypes = [magic_t, c_int, POINTER(c_size_t)]
370 _magic_setparam.errcheck = errorcheck_negative_one
371
372 _magic_getparam = libmagic.magic_getparam
373 _magic_getparam.restype = c_int
374 _magic_getparam.argtypes = [magic_t, c_int, POINTER(c_size_t)]
375 _magic_getparam.errcheck = errorcheck_negative_one
376
377
378 def magic_setparam(cookie, param, val):
379 if not _has_param:
380 raise NotImplementedError("magic_setparam not implemented")
381 v = c_size_t(val)
382 return _magic_setparam(cookie, param, byref(v))
383
384
385 def magic_getparam(cookie, param):
386 if not _has_param:
387 raise NotImplementedError("magic_getparam not implemented")
388 val = c_size_t()
389 _magic_getparam(cookie, param, byref(val))
390 return val.value
391
392
393 _has_version = False
394 if hasattr(libmagic, "magic_version"):
395 _has_version = True
396 magic_version = libmagic.magic_version
397 magic_version.restype = c_int
398 magic_version.argtypes = []
399
400
401 def version():
402 if not _has_version:
403 raise NotImplementedError("magic_version not implemented")
404 return magic_version()
405
406
407 MAGIC_NONE = 0x000000 # No flags
408 MAGIC_DEBUG = 0x000001 # Turn on debugging
409 MAGIC_SYMLINK = 0x000002 # Follow symlinks
410 MAGIC_COMPRESS = 0x000004 # Check inside compressed files
411 MAGIC_DEVICES = 0x000008 # Look at the contents of devices
412 MAGIC_MIME_TYPE = 0x000010 # Return a mime string
413 MAGIC_MIME_ENCODING = 0x000400 # Return the MIME encoding
414 # TODO: should be
415 # MAGIC_MIME = MAGIC_MIME_TYPE | MAGIC_MIME_ENCODING
416 MAGIC_MIME = 0x000010 # Return a mime string
417 MAGIC_EXTENSION = 0x1000000 # Return a /-separated list of extensions
418
419 MAGIC_CONTINUE = 0x000020 # Return all matches
420 MAGIC_CHECK = 0x000040 # Print warnings to stderr
421 MAGIC_PRESERVE_ATIME = 0x000080 # Restore access time on exit
422 MAGIC_RAW = 0x000100 # Don't translate unprintable chars
423 MAGIC_ERROR = 0x000200 # Handle ENOENT etc as real errors
424
425 MAGIC_NO_CHECK_COMPRESS = 0x001000 # Don't check for compressed files
426 MAGIC_NO_CHECK_TAR = 0x002000 # Don't check for tar files
427 MAGIC_NO_CHECK_SOFT = 0x004000 # Don't check magic entries
428 MAGIC_NO_CHECK_APPTYPE = 0x008000 # Don't check application type
429 MAGIC_NO_CHECK_ELF = 0x010000 # Don't check for elf details
430 MAGIC_NO_CHECK_ASCII = 0x020000 # Don't check for ascii files
431 MAGIC_NO_CHECK_TROFF = 0x040000 # Don't check ascii/troff
432 MAGIC_NO_CHECK_FORTRAN = 0x080000 # Don't check ascii/fortran
433 MAGIC_NO_CHECK_TOKENS = 0x100000 # Don't check ascii/tokens
434
435 MAGIC_PARAM_INDIR_MAX = 0 # Recursion limit for indirect magic
436 MAGIC_PARAM_NAME_MAX = 1 # Use count limit for name/use magic
437 MAGIC_PARAM_ELF_PHNUM_MAX = 2 # Max ELF program sections processed
438 MAGIC_PARAM_ELF_SHNUM_MAX = 3 # Max ELF sections processed
439 MAGIC_PARAM_ELF_NOTES_MAX = 4 # Max ELF notes processed
440 MAGIC_PARAM_REGEX_MAX = 5 # Length limit for regex searches
441 MAGIC_PARAM_BYTES_MAX = 6 # Max number of bytes to read from file
442
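
The flag constants are bit masks, and `Magic.__init__` combines them with bitwise OR. A minimal stand-alone sketch of that composition follows; `build_flags` is a hypothetical helper written for illustration, and only the constant values are taken from the listing.

```python
# Values copied from the constants in the listing.
MAGIC_NONE = 0x000000
MAGIC_COMPRESS = 0x000004
MAGIC_MIME_TYPE = 0x000010
MAGIC_CONTINUE = 0x000020
MAGIC_RAW = 0x000100
MAGIC_MIME_ENCODING = 0x000400


def build_flags(mime=False, mime_encoding=False, keep_going=False,
                uncompress=False, raw=False):
    """Combine keyword options into a single libmagic flag word."""
    flags = MAGIC_NONE
    if mime:
        flags |= MAGIC_MIME_TYPE
    if mime_encoding:
        flags |= MAGIC_MIME_ENCODING
    if keep_going:
        flags |= MAGIC_CONTINUE
    if uncompress:
        flags |= MAGIC_COMPRESS
    if raw:
        flags |= MAGIC_RAW
    return flags


combined = build_flags(mime=True, uncompress=True)
```

Because each option occupies its own bit, `build_flags(mime=True, mime_encoding=True)` yields 1040, the value `compat.py` assigns to `MAGIC_MIME`.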
443
444 # This package name conflicts with the one provided by upstream
445 # libmagic. This is a common source of confusion for users. To
446 # resolve this, we ship a copy of that module and expose its functions
447 # wrapped in deprecation warnings.
448 def _add_compat(to_module):
449 import warnings, re
450 from magic import compat
451
452 def deprecation_wrapper(fn):
453 def _(*args, **kwargs):
454 warnings.warn(
455 "Using compatability mode with libmagic's python binding. "
456 "See https://github.com/ahupp/python-magic/blob/master/COMPAT.md for details.",
457 PendingDeprecationWarning)
458
459 return fn(*args, **kwargs)
460
461 return _
462
463 fn = ['detect_from_filename',
464 'detect_from_content',
465 'detect_from_fobj',
466 'open']
467 for fname in fn:
468 to_module[fname] = deprecation_wrapper(compat.__dict__[fname])
469
470 # copy constants over, ensuring there's no conflicts
471 is_const_re = re.compile("^[A-Z_]+$")
472 allowed_inconsistent = set(['MAGIC_MIME'])
473 for name, value in compat.__dict__.items():
474 if is_const_re.match(name):
475 if name in to_module:
476 if name in allowed_inconsistent:
477 continue
478 if to_module[name] != value:
479 raise Exception("inconsistent value for " + name)
480 else:
481 continue
482 else:
483 to_module[name] = value
484
485
486 _add_compat(globals())
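
The wrapping done by `_add_compat` can be tried in isolation. The sketch below is a simplified variant and not the module's own code: it adds `functools.wraps` and `stacklevel=2`, which the listing does not use, and `legacy_add` is a hypothetical stand-in for one of the compat functions.

```python
import functools
import warnings


def deprecation_wrapper(fn, message):
    """Wrap `fn` so that every call emits a PendingDeprecationWarning."""
    @functools.wraps(fn)
    def wrapped(*args, **kwargs):
        # stacklevel=2 attributes the warning to the caller of `wrapped`.
        warnings.warn(message, PendingDeprecationWarning, stacklevel=2)
        return fn(*args, **kwargs)
    return wrapped


def legacy_add(a, b):
    """Hypothetical stand-in for a function exposed from compat.py."""
    return a + b


wrapped_add = deprecation_wrapper(legacy_add, "legacy_add is deprecated")

with warnings.catch_warnings(record=True) as caught:
    # PendingDeprecationWarning is ignored by default; force it to show.
    warnings.simplefilter("always")
    result = wrapped_add(2, 3)
```

The call still returns the wrapped function's result; the warning is a side effect that callers only see if their warning filters allow `PendingDeprecationWarning`.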
0 # coding: utf-8
1
2 '''
3 Python bindings for libmagic
4 '''
5
6 import ctypes
7
8 from collections import namedtuple
9
10 from ctypes import *
11 from ctypes.util import find_library
12
13
14 def _init():
15 """
16 Loads the shared library through ctypes and returns a library
17 L{ctypes.CDLL} instance
18 """
19 return ctypes.cdll.LoadLibrary(find_library('magic'))
20
21
22 _libraries = {}
23 _libraries['magic'] = _init()
24
25 # Flag constants for open and setflags
26 MAGIC_NONE = NONE = 0
27 MAGIC_DEBUG = DEBUG = 1
28 MAGIC_SYMLINK = SYMLINK = 2
29 MAGIC_COMPRESS = COMPRESS = 4
30 MAGIC_DEVICES = DEVICES = 8
31 MAGIC_MIME_TYPE = MIME_TYPE = 16
32 MAGIC_CONTINUE = CONTINUE = 32
33 MAGIC_CHECK = CHECK = 64
34 MAGIC_PRESERVE_ATIME = PRESERVE_ATIME = 128
35 MAGIC_RAW = RAW = 256
36 MAGIC_ERROR = ERROR = 512
37 MAGIC_MIME_ENCODING = MIME_ENCODING = 1024
38 MAGIC_MIME = MIME = 1040 # MIME_TYPE + MIME_ENCODING
39 MAGIC_APPLE = APPLE = 2048
40
41 MAGIC_NO_CHECK_COMPRESS = NO_CHECK_COMPRESS = 4096
42 MAGIC_NO_CHECK_TAR = NO_CHECK_TAR = 8192
43 MAGIC_NO_CHECK_SOFT = NO_CHECK_SOFT = 16384
44 MAGIC_NO_CHECK_APPTYPE = NO_CHECK_APPTYPE = 32768
45 MAGIC_NO_CHECK_ELF = NO_CHECK_ELF = 65536
46 MAGIC_NO_CHECK_TEXT = NO_CHECK_TEXT = 131072
47 MAGIC_NO_CHECK_CDF = NO_CHECK_CDF = 262144
48 MAGIC_NO_CHECK_TOKENS = NO_CHECK_TOKENS = 1048576
49 MAGIC_NO_CHECK_ENCODING = NO_CHECK_ENCODING = 2097152
50
51 MAGIC_NO_CHECK_BUILTIN = NO_CHECK_BUILTIN = 4173824
52
53 FileMagic = namedtuple('FileMagic', ('mime_type', 'encoding', 'name'))
54
55
56 class magic_set(Structure):
57 pass
58
59
60 magic_set._fields_ = []
61 magic_t = POINTER(magic_set)
62
63 _open = _libraries['magic'].magic_open
64 _open.restype = magic_t
65 _open.argtypes = [c_int]
66
67 _close = _libraries['magic'].magic_close
68 _close.restype = None
69 _close.argtypes = [magic_t]
70
71 _file = _libraries['magic'].magic_file
72 _file.restype = c_char_p
73 _file.argtypes = [magic_t, c_char_p]
74
75 _descriptor = _libraries['magic'].magic_descriptor
76 _descriptor.restype = c_char_p
77 _descriptor.argtypes = [magic_t, c_int]
78
79 _buffer = _libraries['magic'].magic_buffer
80 _buffer.restype = c_char_p
81 _buffer.argtypes = [magic_t, c_void_p, c_size_t]
82
83 _error = _libraries['magic'].magic_error
84 _error.restype = c_char_p
85 _error.argtypes = [magic_t]
86
87 _setflags = _libraries['magic'].magic_setflags
88 _setflags.restype = c_int
89 _setflags.argtypes = [magic_t, c_int]
90
91 _load = _libraries['magic'].magic_load
92 _load.restype = c_int
93 _load.argtypes = [magic_t, c_char_p]
94
95 _compile = _libraries['magic'].magic_compile
96 _compile.restype = c_int
97 _compile.argtypes = [magic_t, c_char_p]
98
99 _check = _libraries['magic'].magic_check
100 _check.restype = c_int
101 _check.argtypes = [magic_t, c_char_p]
102
103 _list = _libraries['magic'].magic_list
104 _list.restype = c_int
105 _list.argtypes = [magic_t, c_char_p]
106
107 _errno = _libraries['magic'].magic_errno
108 _errno.restype = c_int
109 _errno.argtypes = [magic_t]
110
111
112 class Magic(object):
113 def __init__(self, ms):
114 self._magic_t = ms
115
116 def close(self):
117 """
118 Closes the magic database and deallocates any resources used.
119 """
120 _close(self._magic_t)
121
122 @staticmethod
123 def __tostr(s):
124 if s is None:
125 return None
126 if isinstance(s, str):
127 return s
128 try: # keep Python 2 compatibility
129 return str(s, 'utf-8')
130 except TypeError:
131 return str(s)
132
133 @staticmethod
134 def __tobytes(b):
135 if b is None:
136 return None
137 if isinstance(b, bytes):
138 return b
139 try: # keep Python 2 compatibility
140 return bytes(b, 'utf-8')
141 except TypeError:
142 return bytes(b)
143
144 def file(self, filename):
145 """
146 Returns a textual description of the contents of the argument passed
147 as a filename or None if an error occurred and the MAGIC_ERROR flag
148 is set. A call to errno() will return the numeric error code.
149 """
150 return Magic.__tostr(_file(self._magic_t, Magic.__tobytes(filename)))
151
152 def descriptor(self, fd):
153 """
154 Returns a textual description of the contents of the argument passed
155 as a file descriptor or None if an error occurred and the MAGIC_ERROR
156 flag is set. A call to errno() will return the numeric error code.
157 """
158 return Magic.__tostr(_descriptor(self._magic_t, fd))
159
160 def buffer(self, buf):
161 """
162 Returns a textual description of the contents of the argument passed
163 as a buffer or None if an error occurred and the MAGIC_ERROR flag
164 is set. A call to errno() will return the numeric error code.
165 """
166 return Magic.__tostr(_buffer(self._magic_t, buf, len(buf)))
167
168 def error(self):
169 """
170 Returns a textual explanation of the last error or None
171 if there was no error.
172 """
173 return Magic.__tostr(_error(self._magic_t))
174
175 def setflags(self, flags):
176 """
177 Set flags on the magic object which determine how magic checking
178 behaves; a bitwise OR of the flags described in libmagic(3), but
179 without the MAGIC_ prefix.
180
181 Returns -1 on systems that don't support utime(2) or utimes(2)
182 when PRESERVE_ATIME is set.
183 """
184 return _setflags(self._magic_t, flags)
185
186 def load(self, filename=None):
187 """
188 Must be called before any magic queries can be performed. Loads
189 entries from the colon separated list of database files passed as
190 argument, or from the default database file if no argument is given.
191
192 Returns 0 on success and -1 on failure.
193 """
194 return _load(self._magic_t, Magic.__tobytes(filename))
195
196 def compile(self, dbs):
197 """
198 Compile entries in the colon separated list of database files
199 passed as argument or the default database file if no argument.
200 The compiled files created are named from the basename(1) of each file
201 argument with ".mgc" appended to it.
202
203 Returns 0 on success and -1 on failure.
204 """
205 return _compile(self._magic_t, Magic.__tobytes(dbs))
206
207 def check(self, dbs):
208 """
209 Check the validity of entries in the colon separated list of
210 database files passed as argument or the default database file
211 if no argument.
212
213 Returns 0 on success and -1 on failure.
214 """
215 return _check(self._magic_t, Magic.__tobytes(dbs))
216
217 def list(self, dbs):
218 """
219 List the magic entries in the colon separated list of
220 database files passed as argument or the default database file
221 if no argument, in a human-readable format.
222
223 Returns 0 on success and -1 on failure.
224 """
225 return _list(self._magic_t, Magic.__tobytes(dbs))
226
227 def errno(self):
228 """
229 Returns a numeric error code. If return value is 0, an internal
230 magic error occurred. If return value is non-zero, the value is
231 an OS error code. The errno module or os.strerror() can be used
232 to provide detailed error information.
233 """
234 return _errno(self._magic_t)
235
236
237 def open(flags):
238 """
239 Returns a magic object on success and None on failure.
240 Flags argument as for setflags.
241 """
242 return Magic(_open(flags))
243
244
245 # Objects used by `detect_from_` functions
246 mime_magic = Magic(_open(MAGIC_MIME))
247 mime_magic.load()
248 none_magic = Magic(_open(MAGIC_NONE))
249 none_magic.load()
250
251
252 def _create_filemagic(mime_detected, type_detected):
253 mime_type, mime_encoding = mime_detected.split('; ')
254
255 return FileMagic(name=type_detected, mime_type=mime_type,
256 encoding=mime_encoding.replace('charset=', ''))
257
258
259 def detect_from_filename(filename):
260 '''Detect mime type, encoding and file type from a filename
261
262 Returns a `FileMagic` namedtuple.
263 '''
264
265 return _create_filemagic(mime_magic.file(filename),
266 none_magic.file(filename))
267
268
269 def detect_from_fobj(fobj):
270 '''Detect mime type, encoding and file type from file-like object
271
272 Returns a `FileMagic` namedtuple.
273 '''
274
275 file_descriptor = fobj.fileno()
276 return _create_filemagic(mime_magic.descriptor(file_descriptor),
277 none_magic.descriptor(file_descriptor))
278
279
280 def detect_from_content(byte_content):
281 '''Detect mime type, encoding and file type from bytes
282
283 Returns a `FileMagic` namedtuple.
284 '''
285
286 return _create_filemagic(mime_magic.buffer(byte_content),
287 none_magic.buffer(byte_content))
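For illustration, the string handling inside `_create_filemagic` can be tried without libmagic installed. The sketch below reimplements only the parsing step on hard-coded sample strings; `parse_mime_output` is a hypothetical helper name and is not part of the package.

```python
from collections import namedtuple

# Same field layout as the FileMagic namedtuple in magic/compat.py.
FileMagic = namedtuple('FileMagic', ('mime_type', 'encoding', 'name'))


def parse_mime_output(mime_detected, type_detected):
    """Split 'type/subtype; charset=xyz' the way _create_filemagic does.

    Hypothetical helper for illustration; not part of python-magic.
    """
    mime_type, mime_encoding = mime_detected.split('; ')
    return FileMagic(name=type_detected, mime_type=mime_type,
                     encoding=mime_encoding.replace('charset=', ''))


# Sample strings of the shape libmagic returns when MAGIC_MIME is set.
result = parse_mime_output('application/pdf; charset=us-ascii',
                           'PDF document, version 1.2')
print(result.mime_type, result.encoding, result.name)
```

Note that the two-way unpacking assumes exactly one `'; '` separator in the MIME string; any other shape raises `ValueError`.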
+0
-296
magic.py
0 """
1 magic is a wrapper around the libmagic file identification library.
2
3 See README for more information.
4
5 Usage:
6
7 >>> import magic
8 >>> magic.from_file("testdata/test.pdf")
9 'PDF document, version 1.2'
10 >>> magic.from_file("testdata/test.pdf", mime=True)
11 'application/pdf'
12 >>> magic.from_buffer(open("testdata/test.pdf").read(1024))
13 'PDF document, version 1.2'
14 >>>
15
16
17 """
18
19 import sys
20 import glob
21 import os.path
22 import ctypes
23 import ctypes.util
24 import threading
25
26 from ctypes import c_char_p, c_int, c_size_t, c_void_p
27
28
29 class MagicException(Exception):
30 def __init__(self, message):
31 super(MagicException, self).__init__(message)
32 self.message = message
33
34
35 class Magic:
36 """
37 Magic is a wrapper around the libmagic C library.
38
39 """
40
41 def __init__(self, mime=False, magic_file=None, mime_encoding=False,
42 keep_going=False, uncompress=False):
43 """
44 Create a new libmagic wrapper.
45
46 mime - if True, mimetypes are returned instead of textual descriptions
47 mime_encoding - if True, codec is returned
48 magic_file - use a mime database other than the system default
49 keep_going - don't stop at the first match, keep going
50 uncompress - Try to look inside compressed files.
51 """
52 self.flags = MAGIC_NONE
53 if mime:
54 self.flags |= MAGIC_MIME
55 if mime_encoding:
56 self.flags |= MAGIC_MIME_ENCODING
57 if keep_going:
58 self.flags |= MAGIC_CONTINUE
59
60 if uncompress:
61 self.flags |= MAGIC_COMPRESS
62
63 self.cookie = magic_open(self.flags)
64 self.lock = threading.Lock()
65
66 magic_load(self.cookie, magic_file)
67
68 def from_buffer(self, buf):
69 """
70 Identify the contents of `buf`
71 """
72 with self.lock:
73 try:
74 return maybe_decode(magic_buffer(self.cookie, buf))
75 except MagicException as e:
76 return self._handle509Bug(e)
77
78 def from_file(self, filename):
79 # raise FileNotFoundError or IOError if the file does not exist
80 with open(filename):
81 pass
82 with self.lock:
83 try:
84 return maybe_decode(magic_file(self.cookie, filename))
85 except MagicException as e:
86 return self._handle509Bug(e)
87
88 def _handle509Bug(self, e):
89 # libmagic 5.09 has a bug where it might fail to identify the
90 # mimetype of a file and returns null from magic_file (and
91 # likely _buffer), but also does not return an error message.
92 if e.message is None and (self.flags & MAGIC_MIME):
93 return "application/octet-stream"
94 else:
95 raise e
96
97 def __del__(self):
98 # no _thread_check here because there can be no other
99 # references to this object at this point.
100
101 # during shutdown magic_close may have been cleared already so
102 # make sure it exists before using it.
103
104 # the self.cookie check should be unnecessary and was an
105 # incorrect fix for a threading problem, however I'm leaving
106 # it in because it's harmless and I'm slightly afraid to
107 # remove it.
108 if self.cookie and magic_close:
109 magic_close(self.cookie)
110 self.cookie = None
111
112 _instances = {}
113
114 def _get_magic_type(mime):
115 i = _instances.get(mime)
116 if i is None:
117 i = _instances[mime] = Magic(mime=mime)
118 return i
119
120 def from_file(filename, mime=False):
121 """
122 Accepts a filename and returns the detected filetype. Return
123 value is the mimetype if mime=True, otherwise a human readable
124 name.
125
126 >>> magic.from_file("testdata/test.pdf", mime=True)
127 'application/pdf'
128 """
129 m = _get_magic_type(mime)
130 return m.from_file(filename)
131
132 def from_buffer(buffer, mime=False):
133 """
134 Accepts a binary string and returns the detected filetype. Return
135 value is the mimetype if mime=True, otherwise a human readable
136 name.
137
138 >>> magic.from_buffer(open("testdata/test.pdf").read(1024))
139 'PDF document, version 1.2'
140 """
141 m = _get_magic_type(mime)
142 return m.from_buffer(buffer)
143
144
145
146
147 libmagic = None
148 # Let's try to find magic or magic1
149 dll = ctypes.util.find_library('magic') or ctypes.util.find_library('magic1') or ctypes.util.find_library('cygmagic-1')
150
151 # This is necessary because find_library returns None if it doesn't find the library
152 if dll:
153 libmagic = ctypes.CDLL(dll)
154
155 if not libmagic or not libmagic._name:
156 windows_dlls = ['magic1.dll','cygmagic-1.dll']
157 platform_to_lib = {'darwin': ['/opt/local/lib/libmagic.dylib',
158 '/usr/local/lib/libmagic.dylib'] +
159 # Assumes there will only be one version installed
160 glob.glob('/usr/local/Cellar/libmagic/*/lib/libmagic.dylib'),
161 'win32': windows_dlls,
162 'cygwin': windows_dlls,
163 'linux': ['libmagic.so.1'], # fallback for some Linuxes (e.g. Alpine) where library search does not work
164 }
165 platform = 'linux' if sys.platform.startswith('linux') else sys.platform
166 for dll in platform_to_lib.get(platform, []):
167 try:
168 libmagic = ctypes.CDLL(dll)
169 break
170 except OSError:
171 pass
172
173 if not libmagic or not libmagic._name:
174 # It is better to raise an ImportError since we are importing magic module
175 raise ImportError('failed to find libmagic. Check your installation')
176
177 magic_t = ctypes.c_void_p
178
179 def errorcheck_null(result, func, args):
180 if result is None:
181 err = magic_error(args[0])
182 raise MagicException(err)
183 else:
184 return result
185
186 def errorcheck_negative_one(result, func, args):
187 if result == -1:
188 err = magic_error(args[0])
189 raise MagicException(err)
190 else:
191 return result
192
193
194 # return str on python3. Don't want to unconditionally
195 # decode because that results in unicode on python2
196 def maybe_decode(s):
197 if str == bytes:
198 return s
199 else:
200 return s.decode('utf-8')
201
202 def coerce_filename(filename):
203 if filename is None:
204 return None
205
206 # ctypes will implicitly convert unicode strings to bytes with
207 # .encode('ascii'). If you use the filesystem encoding
208 # then you'll get inconsistent behavior (crashes) depending on the user's
209 # LANG environment variable
210 is_unicode = (sys.version_info[0] <= 2 and
211 isinstance(filename, unicode)) or \
212 (sys.version_info[0] >= 3 and
213 isinstance(filename, str))
214 if is_unicode:
215 return filename.encode('utf-8')
216 else:
217 return filename
218
219 magic_open = libmagic.magic_open
220 magic_open.restype = magic_t
221 magic_open.argtypes = [c_int]
222
223 magic_close = libmagic.magic_close
224 magic_close.restype = None
225 magic_close.argtypes = [magic_t]
226
227 magic_error = libmagic.magic_error
228 magic_error.restype = c_char_p
229 magic_error.argtypes = [magic_t]
230
231 magic_errno = libmagic.magic_errno
232 magic_errno.restype = c_int
233 magic_errno.argtypes = [magic_t]
234
235 _magic_file = libmagic.magic_file
236 _magic_file.restype = c_char_p
237 _magic_file.argtypes = [magic_t, c_char_p]
238 _magic_file.errcheck = errorcheck_null
239
240 def magic_file(cookie, filename):
241 return _magic_file(cookie, coerce_filename(filename))
242
243 _magic_buffer = libmagic.magic_buffer
244 _magic_buffer.restype = c_char_p
245 _magic_buffer.argtypes = [magic_t, c_void_p, c_size_t]
246 _magic_buffer.errcheck = errorcheck_null
247
248 def magic_buffer(cookie, buf):
249 return _magic_buffer(cookie, buf, len(buf))
250
251
252 _magic_load = libmagic.magic_load
253 _magic_load.restype = c_int
254 _magic_load.argtypes = [magic_t, c_char_p]
255 _magic_load.errcheck = errorcheck_negative_one
256
257 def magic_load(cookie, filename):
258 return _magic_load(cookie, coerce_filename(filename))
259
260 magic_setflags = libmagic.magic_setflags
261 magic_setflags.restype = c_int
262 magic_setflags.argtypes = [magic_t, c_int]
263
264 magic_check = libmagic.magic_check
265 magic_check.restype = c_int
266 magic_check.argtypes = [magic_t, c_char_p]
267
268 magic_compile = libmagic.magic_compile
269 magic_compile.restype = c_int
270 magic_compile.argtypes = [magic_t, c_char_p]
271
272
273
274 MAGIC_NONE = 0x000000 # No flags
275 MAGIC_DEBUG = 0x000001 # Turn on debugging
276 MAGIC_SYMLINK = 0x000002 # Follow symlinks
277 MAGIC_COMPRESS = 0x000004 # Check inside compressed files
278 MAGIC_DEVICES = 0x000008 # Look at the contents of devices
279 MAGIC_MIME = 0x000010 # Return a mime string
280 MAGIC_MIME_ENCODING = 0x000400 # Return the MIME encoding
281 MAGIC_CONTINUE = 0x000020 # Return all matches
282 MAGIC_CHECK = 0x000040 # Print warnings to stderr
283 MAGIC_PRESERVE_ATIME = 0x000080 # Restore access time on exit
284 MAGIC_RAW = 0x000100 # Don't translate unprintable chars
285 MAGIC_ERROR = 0x000200 # Handle ENOENT etc as real errors
286
287 MAGIC_NO_CHECK_COMPRESS = 0x001000 # Don't check for compressed files
288 MAGIC_NO_CHECK_TAR = 0x002000 # Don't check for tar files
289 MAGIC_NO_CHECK_SOFT = 0x004000 # Don't check magic entries
290 MAGIC_NO_CHECK_APPTYPE = 0x008000 # Don't check application type
291 MAGIC_NO_CHECK_ELF = 0x010000 # Don't check for elf details
292 MAGIC_NO_CHECK_ASCII = 0x020000 # Don't check for ascii files
293 MAGIC_NO_CHECK_TROFF = 0x040000 # Don't check ascii/troff
294 MAGIC_NO_CHECK_FORTRAN = 0x080000 # Don't check ascii/fortran
295 MAGIC_NO_CHECK_TOKENS = 0x100000 # Don't check ascii/tokens
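The flag values listed above are single bits, so they combine with bitwise OR, as the old `Magic.__init__` does. A minimal sketch, using values copied from this listing and from magic/compat.py; `build_flags` is a hypothetical name and no libmagic call is made:

```python
# Values copied from the constants in this listing; no libmagic call is made.
MAGIC_NONE = 0x000000
MAGIC_COMPRESS = 0x000004
MAGIC_MIME_TYPE = 0x000010      # named MAGIC_MIME in the old magic.py
MAGIC_CONTINUE = 0x000020
MAGIC_MIME_ENCODING = 0x000400


def build_flags(mime=False, mime_encoding=False, keep_going=False,
                uncompress=False):
    """Mirror the flag assembly in the old Magic.__init__ (illustrative)."""
    flags = MAGIC_NONE
    if mime:
        flags |= MAGIC_MIME_TYPE
    if mime_encoding:
        flags |= MAGIC_MIME_ENCODING
    if keep_going:
        flags |= MAGIC_CONTINUE
    if uncompress:
        flags |= MAGIC_COMPRESS
    return flags


flags = build_flags(mime=True, uncompress=True)
print(hex(flags))                      # 0x14
print(bool(flags & MAGIC_COMPRESS))    # True
```

The same arithmetic explains the `1040` in magic/compat.py: `MAGIC_MIME_TYPE | MAGIC_MIME_ENCODING` is 16 + 1024.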
0 Metadata-Version: 2.1
1 Name: python-magic
2 Version: 0.4.20
3 Summary: File type identification using libmagic
4 Home-page: http://github.com/ahupp/python-magic
5 Author: Adam Hupp
6 Author-email: [email protected]
7 License: MIT
8 Description: # python-magic
9 [![PyPI version](https://badge.fury.io/py/python-magic.svg)](https://badge.fury.io/py/python-magic)
10 [![Build Status](https://travis-ci.org/ahupp/python-magic.svg?branch=master)](https://travis-ci.org/ahupp/python-magic)
11
12 python-magic is a Python interface to the libmagic file type
13 identification library. libmagic identifies file types by checking
14 their headers according to a predefined list of file types. This
15 functionality is exposed to the command line by the Unix command
16 `file`.
17
18 ## Usage
19
20 ```python
21 >>> import magic
22 >>> magic.from_file("testdata/test.pdf")
23 'PDF document, version 1.2'
24 # recommend using at least the first 2048 bytes, as less can produce incorrect identification
25 >>> magic.from_buffer(open("testdata/test.pdf", "rb").read(2048))
26 'PDF document, version 1.2'
27 >>> magic.from_file("testdata/test.pdf", mime=True)
28 'application/pdf'
29 ```
30
31 There is also a `Magic` class that provides more direct control,
32 including overriding the magic database file and turning on character
33 encoding detection. This is not recommended for general use. In
34 particular, it's not safe for sharing across multiple threads and
35 will throw if this is attempted.
36
37 ```python
38 >>> f = magic.Magic(uncompress=True)
39 >>> f.from_file('testdata/test.gz')
40 'ASCII text (gzip compressed data, was "test", last modified: Sat Jun 28
41 21:32:52 2008, from Unix)'
42 ```
43
44 You can also combine the flag options:
45
46 ```python
47 >>> f = magic.Magic(mime=True, uncompress=True)
48 >>> f.from_file('testdata/test.gz')
49 'text/plain'
50 ```
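Since the README states that a `Magic` instance is not safe to share across threads, one possible pattern is to keep a separate instance per thread. The sketch below uses `threading.local` from the standard library; `get_detector` and its `factory` parameter are hypothetical names, not part of python-magic, and the factory is injected so the pattern can be tried without libmagic installed.

```python
import threading

# Thread-local storage: each thread sees its own `detector` attribute.
_local = threading.local()


def get_detector(factory):
    """Return the calling thread's detector, creating it on first use.

    `factory` is any zero-argument callable, for example
    `lambda: magic.Magic(mime=True)` when python-magic is available.
    """
    detector = getattr(_local, "detector", None)
    if detector is None:
        detector = _local.detector = factory()
    return detector


# Demonstration with a stand-in factory instead of magic.Magic.
first = get_detector(object)
second = get_detector(object)
print(first is second)  # same thread, same instance
```

Each worker thread that calls `get_detector` gets its own object, so no instance is ever shared between threads.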
51
52 ## Installation
53
54 The current stable version of python-magic is available on PyPI and
55 can be installed by running `pip install python-magic`.
56
57 Other sources:
58
59 - PyPI: http://pypi.python.org/pypi/python-magic/
60 - GitHub: https://github.com/ahupp/python-magic
61
62 This module is a simple wrapper around the libmagic C library, and
63 that must be installed as well:
64
65 ### Debian/Ubuntu
66
67 ```
68 sudo apt-get install libmagic1
69 ```
70
71 ### Windows
72
73 You'll need DLLs for libmagic. @julian-r maintains a PyPI package with the DLLs; you can fetch it with:
74
75 ```
76 pip install python-magic-bin
77 ```
78
79 ### OSX
80
81 - When using Homebrew: `brew install libmagic`
82 - When using macports: `port install file`
83
84 ### Troubleshooting
85
86 - 'MagicException: could not find any magic files!': some
87 installations of libmagic do not correctly point to their magic
88 database file. Try specifying the path to the file explicitly in the
89 constructor: `magic.Magic(magic_file="path_to_magic_file")`.
90
91 - 'WindowsError: [Error 193] %1 is not a valid Win32 application':
92 Attempting to run the 32-bit libmagic DLL in a 64-bit build of
93 python will fail with this error. Here are 64-bit builds of libmagic for windows: https://github.com/pidydx/libmagicwin64.
94 A newer version can be found here: https://github.com/nscaife/file-windows.
95
96 - 'WindowsError: exception: access violation writing 0x00000000': This may indicate you are mixing
97 Windows Python and Cygwin Python. Make sure your libmagic and python builds are consistent.
98
99
100 ## Bug Reports
101
102 python-magic is a thin layer over the libmagic C library.
103 Historically, most bugs that have been reported against python-magic
104 are actually bugs in libmagic; libmagic bugs can be reported on their
105 tracker here: https://bugs.astron.com/my_view_page.php. If you're not
106 sure where the bug lies feel free to file an issue on GitHub and I can
107 triage it.
108
109 ## Running the tests
110
111 To run the tests across a variety of linux distributions (depends on Docker):
112
113 ```
114 ./test_docker.sh
115 ```
116
117 To run tests locally across all available python versions:
118
119 ```
120 ./test/run.py
121 ```
122
123 To run against a specific python version:
124
125 ```
126 LC_ALL=en_US.UTF-8 python3 test/test.py
127 ```
128
129 ## libmagic and python-magic
130
131 See [COMPAT.md](COMPAT.md) for a guide to libmagic / python-magic compatibility.
132
133 ## Versioning
134
135 Minor version bumps should be backwards compatible. Major bumps are not.
136
137 ## Author
138
139 Written by Adam Hupp in 2001 for a project that never got off the
140 ground. It originally used SWIG for the C library bindings, but
141 switched to ctypes once that was part of the python standard library.
142
143 You can contact me via my [website](http://hupp.org/adam) or
144 [GitHub](http://github.com/ahupp).
145
146 ## License
147
148 python-magic is distributed under the MIT license. See the included
149 LICENSE file for details.
150
151 I am providing code in the repository to you under an open source license. Because this is my personal repository, the license you receive to my code is from me and not my employer (Facebook).
152
153 Keywords: mime magic file
154 Platform: UNKNOWN
155 Classifier: Intended Audience :: Developers
156 Classifier: License :: OSI Approved :: MIT License
157 Classifier: Programming Language :: Python
158 Classifier: Programming Language :: Python :: 2.7
159 Classifier: Programming Language :: Python :: 3
160 Classifier: Programming Language :: Python :: 3.5
161 Classifier: Programming Language :: Python :: 3.6
162 Classifier: Programming Language :: Python :: 3.7
163 Classifier: Programming Language :: Python :: 3.8
164 Classifier: Programming Language :: Python :: 3.9
165 Classifier: Programming Language :: Python :: Implementation :: CPython
166 Requires-Python: >=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*
167 Description-Content-Type: text/markdown
0 LICENSE
1 MANIFEST.in
2 README.md
3 __init__.py
4 setup.cfg
5 setup.py
6 magic/__init__.py
7 magic/compat.py
8 python_magic.egg-info/PKG-INFO
9 python_magic.egg-info/SOURCES.txt
10 python_magic.egg-info/dependency_links.txt
11 python_magic.egg-info/top_level.txt
12 test/__init__.py
13 test/libmagic_test.py
14 test/run.py
15 test/test.py
00 [global]
1 command_packages=stdeb.command
1 command_packages = stdeb.command
22
33 [bdist_wheel]
44 universal = 1
5
6 [egg_info]
7 tag_build =
8 tag_date = 0
9
00 #!/usr/bin/env python
11 # -*- coding: utf-8 -*-
22
3 from setuptools import setup
3 import setuptools
4 import io
5 import os
46
5 setup(name='python-magic',
6 description='File type identification using libmagic',
7 author='Adam Hupp',
8 author_email='[email protected]',
9 url="http://github.com/ahupp/python-magic",
10 version='0.4.13',
11 py_modules=['magic'],
12 long_description="""This module uses ctypes to access the libmagic file type
13 identification library. It makes use of the local magic database and
14 supports both textual and MIME-type output.
15 """,
16 keywords="mime magic file",
17 license="MIT",
18 test_suite='test',
19 classifiers=[
20 'Intended Audience :: Developers',
21 'License :: OSI Approved :: MIT License',
22 'Programming Language :: Python',
23 'Programming Language :: Python :: 2',
24 'Programming Language :: Python :: 3',
25 ],
26 )
7
8 def read(file_name):
9 """Read a text file and return the content as a string."""
10 with io.open(os.path.join(os.path.dirname(__file__), file_name),
11 encoding='utf-8') as f:
12 return f.read()
13
14 setuptools.setup(
15 name='python-magic',
16 description='File type identification using libmagic',
17 author='Adam Hupp',
18 author_email='[email protected]',
19 url="http://github.com/ahupp/python-magic",
20 version='0.4.20',
21 long_description=read('README.md'),
22 long_description_content_type='text/markdown',
23 packages=['magic'],
24 keywords="mime magic file",
25 license="MIT",
26 python_requires='>=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*',
27 classifiers=[
28 'Intended Audience :: Developers',
29 'License :: OSI Approved :: MIT License',
30 'Programming Language :: Python',
31 'Programming Language :: Python :: 2.7',
32 'Programming Language :: Python :: 3',
33 'Programming Language :: Python :: 3.5',
34 'Programming Language :: Python :: 3.6',
35 'Programming Language :: Python :: 3.7',
36 'Programming Language :: Python :: 3.8',
37 'Programming Language :: Python :: 3.9',
38 'Programming Language :: Python :: Implementation :: CPython',
39 ],
40 )
41
+0
-3
stdeb.cfg
0 [python-magic]
1 Depends: libmagic1
2 Conflicts: python-magic
0 # coding: utf-8
1
2 import unittest
3 import os
4 import magic
5
6 # magic_descriptor is broken (?) in centos 7, so don't run those tests
7 SKIP_FROM_DESCRIPTOR = bool(os.environ.get('SKIP_FROM_DESCRIPTOR'))
8
9 class MagicTestCase(unittest.TestCase):
10 filename = 'testdata/test.pdf'
11 expected_mime_type = 'application/pdf'
12 expected_encoding = 'us-ascii'
13 expected_name = 'PDF document, version 1.2'
14
15 def assert_result(self, result):
16 self.assertEqual(result.mime_type, self.expected_mime_type)
17 self.assertEqual(result.encoding, self.expected_encoding)
18 self.assertEqual(result.name, self.expected_name)
19
20 def test_detect_from_filename(self):
21 result = magic.detect_from_filename(self.filename)
22 self.assert_result(result)
23
24 def test_detect_from_fobj(self):
25
26 if SKIP_FROM_DESCRIPTOR:
27 self.skipTest("magic_descriptor is broken in this version of libmagic")
28
29
30 with open(self.filename) as fobj:
31 result = magic.detect_from_fobj(fobj)
32 self.assert_result(result)
33
34 def test_detect_from_content(self):
35 # differ from upstream by opening file in binary mode,
36 # this avoids hitting a bug in python3+libfile bindings
37 # see https://github.com/ahupp/python-magic/issues/152
38 # for a similar issue
39 with open(self.filename, 'rb') as fobj:
40 result = magic.detect_from_content(fobj.read(4096))
41 self.assert_result(result)
42
43
44 if __name__ == '__main__':
45 unittest.main()
0 import subprocess
1 import os.path
2 import sys
3
4 this_dir = os.path.dirname(sys.argv[0])
5
6 new_env = dict(os.environ)
7 new_env.update({
8 'LC_ALL': 'en_US.UTF-8',
9 'PYTHONPATH': os.path.join(this_dir, ".."),
10 })
11
12
13 def has_py(version):
14 ret = subprocess.run("which %s" % version, shell=True, stdout=subprocess.DEVNULL)
15 return ret.returncode == 0
16
17
18 def run_test(versions):
19 found = False
20 for i in versions:
21 if not has_py(i):
22 # if this version doesn't exist in path, skip
23 continue
24 found = True
25 print("Testing %s" % i)
26 subprocess.run([i, os.path.join(this_dir, "test.py")], env=new_env, check=True)
27 subprocess.run([i, os.path.join(this_dir, "libmagic_test.py")], env=new_env, check=True)
28
29 if not found:
30 sys.exit("No versions found: " + str(versions))
31
32 run_test(["python2", "python2.7"])
33 run_test(["python3.5", "python3.6", "python3.7", "python3.8", "python3.9"])
34
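`has_py` in test/run.py shells out to `which`. As a hedged alternative, the same check can be sketched with `shutil.which` from the standard library (Python 3.3+), which avoids spawning a shell. This is not what the test runner does, and `has_py_no_shell` is a hypothetical name.

```python
import shutil


def has_py_no_shell(version):
    """Return True if an executable named `version` is found on PATH."""
    return shutil.which(version) is not None


# A name that should not exist on any PATH, to show the negative case.
missing = has_py_no_shell("python-interpreter-that-does-not-exist")
print(missing)
```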
+0
-12
test/run.sh
0 #!/bin/sh
1
2 set -e
3
4 # ensure we can use unicode filenames in the test
5 export LC_ALL=en_US.UTF-8
6 THISDIR=`dirname $0`
7 export PYTHONPATH=${THISDIR}/..
8
9 python2.6 ${THISDIR}/test.py
10 python2.7 ${THISDIR}/test.py
11 python3 ${THISDIR}/test.py
0 import os, sys
0 import os
1
12 # for output which reports a local time
23 os.environ['TZ'] = 'GMT'
4
5 if os.environ.get('LC_ALL', '') != 'en_US.UTF-8':
6 # this ensures we're in a utf-8 default filesystem encoding, which is
7 # necessary for some tests
8 raise Exception("must run `export LC_ALL=en_US.UTF-8` before running test suite")
9
310 import shutil
411 import os.path
512 import unittest
613
714 import magic
15 import sys
16
17 # magic_descriptor is broken (?) in centos 7, so don't run those tests
18 SKIP_FROM_DESCRIPTOR = bool(os.environ.get('SKIP_FROM_DESCRIPTOR'))
819
920 class MagicTest(unittest.TestCase):
10 TESTDATA_DIR = os.path.join(os.path.dirname(__file__), 'testdata')
11
12 def assert_values(self, m, expected_values):
21 TESTDATA_DIR = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'testdata')
22
23 def test_version(self):
24 try:
25 self.assertTrue(magic.version() > 0)
26 except NotImplementedError:
27 pass
28
29 def test_fs_encoding(self):
30 self.assertEqual('utf-8', sys.getfilesystemencoding().lower())
31
32 def assert_values(self, m, expected_values, buf_equals_file=True):
1333 for filename, expected_value in expected_values.items():
1434 try:
1535 filename = os.path.join(self.TESTDATA_DIR, filename)
1636 except TypeError:
17 filename = os.path.join(self.TESTDATA_DIR.encode('utf-8'), filename)
18
19
37 filename = os.path.join(
38 self.TESTDATA_DIR.encode('utf-8'), filename)
39
2040 if type(expected_value) is not tuple:
2141 expected_value = (expected_value,)
2242
23 for i in expected_value:
24 with open(filename, 'rb') as f:
25 buf_value = m.from_buffer(f.read())
26
27 file_value = m.from_file(filename)
28 if buf_value == i and file_value == i:
29 break
30 else:
31 self.assertTrue(False, "no match for " + repr(expected_value))
32
43 with open(filename, 'rb') as f:
44 buf_value = m.from_buffer(f.read())
45
46 file_value = m.from_file(filename)
47
48 if buf_equals_file:
49 self.assertEqual(buf_value, file_value)
50
51 for value in (buf_value, file_value):
52 self.assertIn(value, expected_value)
53
54 def test_from_file_str_and_bytes(self):
55 filename = os.path.join(self.TESTDATA_DIR, "test.pdf")
56
57 self.assertEqual('application/pdf',
58 magic.from_file(filename, mime=True))
59 self.assertEqual('application/pdf',
60 magic.from_file(filename.encode('utf-8'), mime=True))
61
62 def test_from_descriptor_str_and_bytes(self):
63 if SKIP_FROM_DESCRIPTOR:
64 self.skipTest("magic_descriptor is broken in this version of libmagic")
65
66 filename = os.path.join(self.TESTDATA_DIR, "test.pdf")
67 with open(filename) as f:
68 self.assertEqual('application/pdf',
69 magic.from_descriptor(f.fileno(), mime=True))
70 self.assertEqual('application/pdf',
71 magic.from_descriptor(f.fileno(), mime=True))
72
73 def test_from_buffer_str_and_bytes(self):
74 if SKIP_FROM_DESCRIPTOR:
75 self.skipTest("magic_descriptor is broken in this version of libmagic")
76 m = magic.Magic(mime=True)
77
78 self.assertTrue(
79 m.from_buffer('#!/usr/bin/env python\nprint("foo")')
80 in ("text/x-python", "text/x-script.python"))
81 self.assertTrue(
82 m.from_buffer(b'#!/usr/bin/env python\nprint("foo")')
83 in ("text/x-python", "text/x-script.python"))
84
3385 def test_mime_types(self):
34 dest = os.path.join(MagicTest.TESTDATA_DIR, b'\xce\xbb'.decode('utf-8'))
86 dest = os.path.join(MagicTest.TESTDATA_DIR,
87 b'\xce\xbb'.decode('utf-8'))
3588 shutil.copyfile(os.path.join(MagicTest.TESTDATA_DIR, 'lambda'), dest)
3689 try:
3790 m = magic.Magic(mime=True)
3891 self.assert_values(m, {
39 'magic.pyc': 'application/octet-stream',
92 'magic._pyc_': ('application/octet-stream', 'text/x-bytecode.python'),
4093 'test.pdf': 'application/pdf',
41 'test.gz': 'application/gzip',
94 'test.gz': ('application/gzip', 'application/x-gzip'),
95 'test.snappy.parquet': 'application/octet-stream',
4296 'text.txt': 'text/plain',
4397 b'\xce\xbb'.decode('utf-8'): 'text/plain',
4498 b'\xce\xbb': 'text/plain',
48102
49103 def test_descriptions(self):
50104 m = magic.Magic()
51 os.environ['TZ'] = 'UTC' # To get the last modified date of test.gz in UTC
105 os.environ['TZ'] = 'UTC' # To get last modified date of test.gz in UTC
52106 try:
53107 self.assert_values(m, {
54 'magic.pyc': 'python 2.4 byte-compiled',
108 'magic._pyc_': 'python 2.4 byte-compiled',
55109 'test.pdf': 'PDF document, version 1.2',
56110 'test.gz':
57 ('gzip compressed data, was "test", from Unix, last modified: Sun Jun 29 01:32:52 2008',
58 'gzip compressed data, was "test", last modified: Sun Jun 29 01:32:52 2008, from Unix'),
111 ('gzip compressed data, was "test", from Unix, last '
112 'modified: Sun Jun 29 01:32:52 2008',
113 'gzip compressed data, was "test", last modified'
114 ': Sun Jun 29 01:32:52 2008, from Unix',
115 'gzip compressed data, was "test", last modified'
116 ': Sun Jun 29 01:32:52 2008, from Unix, original size 15',
117 'gzip compressed data, was "test", '
118 'last modified: Sun Jun 29 01:32:52 2008, '
119 'from Unix, original size modulo 2^32 15',
120 'gzip compressed data, was "test", last modified'
121 ': Sun Jun 29 01:32:52 2008, from Unix, truncated'
122 ),
59123 'text.txt': 'ASCII text',
124 'test.snappy.parquet': ('Apache Parquet', 'Par archive data'),
125 }, buf_equals_file=False)
126 finally:
127 del os.environ['TZ']
128
129 def test_extension(self):
130 try:
131 m = magic.Magic(extension=True)
132 self.assert_values(m, {
133 # some versions return '' for the extensions of a gz file,
134 # including w/ the command line. Who knows...
135 'test.gz': ('gz/tgz/tpz/zabw/svgz', '', '???'),
136 'name_use.jpg': 'jpeg/jpg/jpe/jfif',
60137 })
61 finally:
62 del os.environ['TZ']
138 except NotImplementedError:
139 self.skipTest('MAGIC_EXTENSION not supported in this version')
140
141 def test_unicode_result_nonraw(self):
142 m = magic.Magic(raw=False)
143 src = os.path.join(MagicTest.TESTDATA_DIR, 'pgpunicode')
144 result = m.from_file(src)
145 # NOTE: This check is added as otherwise some magic files don't identify the test case as a PGP key.
146 if 'PGP' in result:
147 assert r"PGP\011Secret Sub-key -" == result
148 else:
149 raise unittest.SkipTest("Magic file doesn't return expected type.")
150
151 def test_unicode_result_raw(self):
152 m = magic.Magic(raw=True)
153 src = os.path.join(MagicTest.TESTDATA_DIR, 'pgpunicode')
154 result = m.from_file(src)
155 if 'PGP' in result:
156 assert b'PGP\tSecret Sub-key -' == result.encode('utf-8')
157 else:
158 raise unittest.SkipTest("Magic file doesn't return expected type.")
63159
64160 def test_mime_encodings(self):
65161 m = magic.Magic(mime_encoding=True)
84180
85181 m = magic.Magic(mime=True)
86182 self.assertEqual(m.from_file(filename), 'image/jpeg')
87
88 m = magic.Magic(mime=True, keep_going=True)
89 self.assertEqual(m.from_file(filename), 'image/jpeg')
90
183
184 try:
185 # this will throw if you have an "old" version of the library
186 # I'm otherwise not sure how to query if keep_going is supported
187 magic.version()
188 m = magic.Magic(mime=True, keep_going=True)
189 self.assertEqual(m.from_file(filename),
190 'image/jpeg\\012- application/octet-stream')
191 except NotImplementedError:
192 pass
91193
92194 def test_rethrow(self):
93195 old = magic.magic_buffer
94196 try:
95 def t(x,y):
197 def t(x, y):
96198 raise magic.MagicException("passthrough")
199
97200 magic.magic_buffer = t
98
99 self.assertRaises(magic.MagicException, magic.from_buffer, "hello", True)
201
202 with self.assertRaises(magic.MagicException):
203 magic.from_buffer("hello", True)
100204 finally:
101205 magic.magic_buffer = old
206
207 def test_getparam(self):
208 m = magic.Magic(mime=True)
209 try:
210 m.setparam(magic.MAGIC_PARAM_INDIR_MAX, 1)
211 self.assertEqual(m.getparam(magic.MAGIC_PARAM_INDIR_MAX), 1)
212 except NotImplementedError:
213 pass
214
215 def test_name_count(self):
216 m = magic.Magic()
217 with open(os.path.join(self.TESTDATA_DIR, 'name_use.jpg'), 'rb') as f:
218 m.from_buffer(f.read())
219
220
102221 if __name__ == '__main__':
103222 unittest.main()
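Several expectations in the tests above are tuples rather than single strings (for example the two MIME types accepted for test.gz), since different libmagic versions report different results for the same file. The diff does not show how assert_values handles this, so the following is only a minimal sketch of one way such a comparison could be written. The helper name matches_expected is hypothetical and not part of python-magic.

```python
def matches_expected(actual, expected):
    """Return True if `actual` is acceptable.

    `expected` is either a single value or a tuple of alternatives,
    mirroring the shape of the expectations in the tests above.
    Hypothetical helper: not taken from python-magic itself.
    """
    if isinstance(expected, tuple):
        # Any one of the listed alternatives counts as a match.
        return actual in expected
    return actual == expected


# Expectation values copied from test_mime_types above.
gzip_mimes = ('application/gzip', 'application/x-gzip')
print(matches_expected('application/x-gzip', gzip_mimes))
print(matches_expected('application/pdf', 'application/pdf'))
print(matches_expected('text/plain', gzip_mimes))
```

Keeping the alternatives in a tuple lets one test table cover several library versions without branching on the version number.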
test/testdata/keep-going.jpg
Binary diff not shown
+0
-1
test/testdata/lambda
0 test
test/testdata/magic.pyc
Binary diff not shown
test/testdata/test.gz
Binary diff not shown
+0
-199
test/testdata/test.pdf
0 %PDF-1.2
1 7 0 obj
2 [5 0 R/XYZ 111.6 757.86]
3 endobj
4 13 0 obj
5 <<
6 /Title(About this document)
7 /A<<
8 /S/GoTo
9 /D(subsection.1.1)
10 >>
11 /Parent 12 0 R
12 /Next 14 0 R
13 >>
14 endobj
15 15 0 obj
16 <<
17 /Title(Compiling with GHC)
18 /A<<
19 /S/GoTo
20 /D(subsubsection.1.2.1)
21 >>
22 /Parent 14 0 R
23 /Next 16 0 R
24 >>
25 endobj
26 16 0 obj
27 <<
28 /Title(Compiling with Hugs)
29 /A<<
30 /S/GoTo
31 /D(subsubsection.1.2.2)
32 >>
33 /Parent 14 0 R
34 /Prev 15 0 R
35 >>
36 endobj
37 14 0 obj
38 <<
39 /Title(Compatibility)
40 /A<<
41 /S/GoTo
42 /D(subsection.1.2)
43 >>
44 /Parent 12 0 R
45 /Prev 13 0 R
46 /First 15 0 R
47 /Last 16 0 R
48 /Count -2
49 /Next 17 0 R
50 >>
51 endobj
52 17 0 obj
53 <<
54 /Title(Reporting bugs)
55 /A<<
56 /S/GoTo
57 /D(subsection.1.3)
58 >>
59 /Parent 12 0 R
60 /Prev 14 0 R
61 /Next 18 0 R
62 >>
63 endobj
64 18 0 obj
65 <<
66 /Title(History)
67 /A<<
68 /S/GoTo
69 /D(subsection.1.4)
70 >>
71 /Parent 12 0 R
72 /Prev 17 0 R
73 /Next 19 0 R
74 >>
75 endobj
76 19 0 obj
77 <<
78 /Title(License)
79 /A<<
80 /S/GoTo
81 /D(subsection.1.5)
82 >>
83 /Parent 12 0 R
84 /Prev 18 0 R
85 >>
86 endobj
87 12 0 obj
88 <<
89 /Title(Introduction)
90 /A<<
91 /S/GoTo
92 /D(section.1)
93 >>
94 /Parent 11 0 R
95 /First 13 0 R
96 /Last 19 0 R
97 /Count -5
98 /Next 20 0 R
99 >>
100 endobj
101 21 0 obj
102 <<
103 /Title(Running a parser)
104 /A<<
105 /S/GoTo
106 /D(subsection.2.1)
107 >>
108 /Parent 20 0 R
109 /Next 22 0 R
110 >>
111 endobj
112 22 0 obj
113 <<
114 /Title(Sequence and choice)
115 /A<<
116 /S/GoTo
117 /D(subsection.2.2)
118 >>
119 /Parent 20 0 R
120 /Prev 21 0 R
121 /Next 23 0 R
122 >>
123 endobj
124 23 0 obj
125 <<
126 /Title(Predictive parsers)
127 /A<<
128 /S/GoTo
129 /D(subsection.2.3)
130 >>
131 /Parent 20 0 R
132 /Prev 22 0 R
133 /Next 24 0 R
134 >>
135 endobj
136 24 0 obj
137 <<
138 /Title(Adding semantics)
139 /A<<
140 /S/GoTo
141 /D(subsection.2.4)
142 >>
143 /Parent 20 0 R
144 /Prev 23 0 R
145 /Next 25 0 R
146 >>
147 endobj
148 25 0 obj
149 <<
150 /Title(Sequences and seperators)
151 /A<<
152 /S/GoTo
153 /D(subsection.2.5)
154 >>
155 /Parent 20 0 R
156 /Prev 24 0 R
157 /Next 26 0 R
158 >>
159 endobj
160 26 0 obj
161 <<
162 /Title(Improving error messages)
163 /A<<
164 /S/GoTo
165 /D(subsection.2.6)
166 >>
167 /Parent 20 0 R
168 /Prev 25 0 R
169 /Next 27 0 R
170 >>
171 endobj
172 27 0 obj
173 <<
174 /Title(Expressions)
175 /A<<
176 /S/GoTo
177 /D(subsection.2.7)
178 >>
179 /Parent 20 0 R
180 /Prev 26 0 R
181 /Next 28 0 R
182 >>
183 endobj
184 28 0 obj
185 <<
186 /Title(Lexical analysis)
187 /A<<
188 /S/GoTo
189 /D(subsection.2.8)
190 >>
191 /Parent 20 0 R
192 /Prev 27 0 R
193 /Next 29 0 R
194 >>
195 endobj
196 30 0 obj
197 <<
198 /Title(Lexeme parsers
+0
-2
test/testdata/text-iso8859-1.txt
0 This is a web page encoded in iso-8859-1
1 éèàùôâïî
+0
-2
test/testdata/text.txt
0 Hello, World!
1