monitoring#
Measuring Computing Time#
measure_time#
- onnxrt_backend_dev.monitoring.benchmark.measure_time(stmt: str | Callable, context: Dict[str, Any] | None = None, repeat: int = 10, number: int = 50, warmup: int = 1, div_by_number: bool = True, max_time: float | None = None) Dict[str, str | int | float] [source]#
Measures a statement and returns the results as a dictionary.
- Parameters:
stmt – string or callable
context – variables needed by the statement, passed as a dictionary
repeat – average over repeat experiments
number – number of executions in a row
warmup – number of iterations to run before starting the real measurement
div_by_number – divide by the number of executions
max_time – execute the statement until the total time goes beyond this value (approximately); repeat is ignored and div_by_number must be set to True
- Returns:
dictionary
<<<
from pprint import pprint
from math import cos

from onnxrt_backend_dev.monitoring.benchmark import measure_time

res = measure_time(lambda: cos(0.5))
pprint(res)
>>>
{'average': 9.11999986783485e-08,
 'context_size': 64,
 'deviation': 2.56125002230228e-09,
 'max_exec': 9.800000043469481e-08,
 'min_exec': 8.799999704933726e-08,
 'number': 50,
 'repeat': 10,
 'ttime': 9.11999986783485e-07,
 'warmup_time': 9.40000018090359e-06}
See Timer.repeat for a better understanding of parameters repeat and number. The function returns a duration corresponding to number executions of the main statement.
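Since the function builds on the standard timeit machinery, the average/deviation computation can be sketched with the standard library alone. The names avg, dev, and per_exec below are illustrative and not part of the API; this is a simplified sketch, not the library's actual implementation.

```python
import math
import timeit
from math import cos

# Repeat the measurement `repeat` times, `number` executions each,
# mirroring measure_time's repeat/number parameters.
number, repeat = 50, 10
timer = timeit.Timer(lambda: cos(0.5))
raw = timer.repeat(repeat=repeat, number=number)

# Divide each run by `number` to get a per-execution duration
# (the effect of div_by_number=True).
per_exec = [t / number for t in raw]
avg = sum(per_exec) / repeat
dev = math.sqrt(sum((t - avg) ** 2 for t in per_exec) / repeat)
print({"average": avg, "deviation": dev, "number": number, "repeat": repeat})
```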
Monitoring Memory#
start_spying_on#
- onnxrt_backend_dev.monitoring.memory_peak.start_spying_on(pid: int | None = None, delay: float = 0.01, cuda: bool = False) MemorySpy [source]#
Starts the memory spy. The function starts another process spying on the one sent as an argument.
- Parameters:
pid – process id to spy on, or the current one if None.
delay – delay between two measures.
cuda – True or False to get memory for cuda devices
Example:
.. code-block:: python

    from onnxrt_backend_dev.monitoring.memory_peak import start_spying_on

    p = start_spying_on()
    # …
    # code to measure
    # …
    stat = p.stop()
    print(stat)
MemorySpy#
- class onnxrt_backend_dev.monitoring.memory_peak.MemorySpy(pid: int, delay: float = 0.01, cuda: bool = False)[source]#
Information about the spy. The measure is started with class method start and ended by calling method stop.
- Parameters:
pid – process id of the process to spy on
delay – spy on every delay seconds
cuda – enable cuda monitoring
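A spying process needs a way to poll the memory of the process it watches. The helper below is a hedged, Linux-only illustration of one such poll, reading VmRSS from /proc/&lt;pid&gt;/status; it is not the library's actual implementation, and the function name read_rss_kb is made up for this sketch.

```python
import os


def read_rss_kb(pid: int) -> int:
    """Return the resident set size (VmRSS) of a process, in kB.

    Linux-only sketch: parses /proc/<pid>/status, which is one way a
    spying process can sample another process's memory at each delay.
    """
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                # The line looks like "VmRSS:    12345 kB".
                return int(line.split()[1])
    raise RuntimeError(f"VmRSS not found for pid {pid}")


print(read_rss_kb(os.getpid()))
```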
Monitor#
Profiling#
ProfileNode#
- class onnxrt_backend_dev.monitoring.profiling.ProfileNode(filename: str, line: int, func_name: str, nc1: int, nc2: int, tin: float, tall: float)[source]#
Graph structure to represent a profiling.
- Parameters:
filename – filename
line – line number
func_name – function name
nc1 – number of calls 1
nc2 – number of calls 2
tin – time spent in the function
tall – time spent in the function and in the sub functions
- add_called_by(pnode: ProfileNode)[source]#
Registers a node calling this one.
- add_calls_to(pnode: ProfileNode, time_elements)[source]#
Registers a node this one calls.
- as_dict(filter_node=None, sort_key=SortKey.LINE)[source]#
Renders the results of a profiling interpreted with function profile2graph. The rows can then be loaded into a dataframe.
- Parameters:
filter_node – display only the nodes for which this function returns True; if None, the default function removes built-in functions with small impact
sort_key – sort sub nodes by…
- Returns:
rows
- static filter_node_(node, info=None) bool [source]#
Default filter telling whether a node should be displayed.
- Parameters:
node – node
info – if the node is called by a function, this dictionary can be used to overwrite the attributes held by the node
- Returns:
boolean (True to keep, False to forget)
- property key#
Returns file:line.
- to_json(filter_node=None, sort_key=SortKey.LINE, as_str=True, **kwargs) str | Dict[str, Any] [source]#
Renders the results of a profiling interpreted with function profile2graph as JSON.
- Parameters:
filter_node – display only the nodes for which this function returns True; if None, the default function removes built-in functions with small impact
sort_key – sort sub nodes by…
as_str – converts the json into a string
kwargs – see json.dumps()
- Returns:
rows
- to_text(filter_node=None, sort_key=SortKey.LINE, fct_width=60) str [source]#
Renders the profiling as text.
- Parameters:
filter_node – display only the nodes for which this function returns True; if None, the default function removes built-in functions with small impact
sort_key – sort sub nodes by…
fct_width – width of the column holding the function names
- Returns:
rows
profile#
- onnxrt_backend_dev.monitoring.profiling.profile(fct: Callable, sort: str = 'cumulative', rootrem: str | None = None, as_df: bool = False, return_results: bool = False, **kwargs) Tuple[Stats, Any] | Tuple[Stats, Any, Any] [source]#
Profiles the execution of a function.
- Parameters:
fct – function to profile
sort – see sort_stats
rootrem – root to remove in filenames
as_df – return the results as a dataframe and not text
return_results – if True, return results as well (in the first position)
kwargs – additional parameters used to create the profiler, see cProfile.Profile
- Returns:
raw results, statistics text dump (or dataframe if as_df is True)
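The function wraps the standard cProfile/pstats machinery. A minimal stdlib sketch of that pattern, profiling a function and dumping statistics sorted by cumulative time, might look like the following; the workload function work is made up for the example, and this is not the library's exact implementation.

```python
import cProfile
import io
import pstats
from math import cos


def work():
    # Arbitrary workload to profile.
    return sum(cos(i * 0.001) for i in range(10_000))


# Profile the call, mirroring profile(fct, sort="cumulative").
pr = cProfile.Profile()
pr.enable()
work()
pr.disable()

# Dump statistics sorted by cumulative time into a string,
# the raw text form before any dataframe conversion.
buf = io.StringIO()
ps = pstats.Stats(pr, stream=buf).sort_stats("cumulative")
ps.print_stats()
text = buf.getvalue()
print(text.splitlines()[0])
```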
profile2graph#
- onnxrt_backend_dev.monitoring.profiling.profile2graph(ps: Stats, clean_text: Callable | None = None, verbose: bool = False) Tuple[Any, Dict[Any, ProfileNode]] [source]#
Converts profiling statistics into a graph.
- Parameters:
ps – an instance of pstats.Stats
clean_text – function to clean function names
verbose – verbosity
- Returns:
an instance of ProfileNode
pyinstrument has a nice display showing time spent and the call stack at the same time. This function tries to replicate that display based on the results produced by module cProfile. Here is an example.
<<<
import time

from onnxrt_backend_dev.monitoring.profiling import profile, profile2graph


def fct0(t):
    time.sleep(t)


def fct1(t):
    time.sleep(t)


def fct2():
    fct1(0.1)
    fct1(0.01)


def fct3():
    fct0(0.2)
    fct1(0.5)


def fct4():
    fct2()
    fct3()


ps = profile(fct4)[0]
root, nodes = profile2graph(ps, clean_text=lambda x: x.split("/")[-1])
text = root.to_text()
print(text)
>>>
fct1                                 --  3  3 -- 0.00004 0.61213 -- :11:fct1 (fct1)
    <built-in method time.sleep>     --  3  3 -- 0.61209 0.61209 -- ~:0:<built-in method time.sleep> (<built-in method time.sleep>) +++
fct4                                 --  1  1 -- 0.00001 0.81307 -- :25:fct4 (fct4)
    fct2                             --  1  1 -- 0.00001 0.11089 -- :15:fct2 (fct2)
        fct1                         --  2  2 -- 0.00001 0.11089 -- :11:fct1 (fct1) +++
    fct3                             --  1  1 -- 0.00003 0.70216 -- :20:fct3 (fct3)
        fct0                         --  1  1 -- 0.00001 0.20089 -- :7:fct0 (fct0)
            <built-in method time.sleep> --  1  1 -- 0.20087 0.20087 -- ~:0:<built-in method time.sleep> (<built-in method time.sleep>) +++
        fct1                         --  1  1 -- 0.00002 0.50125 -- :11:fct1 (fct1) +++
<built-in method time.sleep>         --  4  4 -- 0.81297 0.81297 -- ~:0:<built-in method time.sleep> (<built-in method time.sleep>)